-       morpheme boundary
=       clitic boundary
.       syllable boundary
<>      infix
*       ungrammatical form
/ /     phonological representation
[ ]     phonetic representation
<>      in text: orthographic representation;
        in examples: phonetic boundary
σ       syllable
>       link to the audio file¹
1       first person
2       second person
3       third person
ACT     actor
AND     andative
APPL    applicative
AV      actor voice
COLL    collective
CPL     completive
DIST    distal
EXIST   existential
GEN     genitive
GER     gerund
HON     honorific article
INCPL   incompletive
INTJ    interjection
LOC     locative
LV      locative voice
MED     medial
N       nasal
NEG     negation
NMLZ    nominalizer
NP      noun phrase
NR      numeral
NRLS    nonrealis
P       plural
PE      plural exclusive
PI      plural inclusive
PN      personal name
POT     potentive
PP      prepositional phrase
PRX     proximative
QUOT    quotative
RCP     reciprocal
RDP     reduplication
REL     relative pronoun
RLS     realis
S       singular
SF      stem former
ST      stative
UV      undergoer voice
VEN     venitive


¹ Audio files corresponding to the referenced examples can be found on the Open Science
Framework (OSF) platform (Bracks 2023). Clicking on the ‘>’ symbol follows a hyperlink
to the audio files associated with that specific example.


1 Introduction 


The present book is an investigation into aspects of Totoli’s prosody, intonation 
and the prosody-syntax interface. Totoli is an endangered Austronesian language 
of the Malayo-Polynesian group and this book is the first study of the intonation 
of Totoli and among the few investigations into the prosody and intonation of 
Austronesian languages in general. The investigation seeks to uphold maximal 
ecological validity (Cicourel 2007). To this end, the analysis is based on an exten- 
sive corpus of natural (semi-)spontaneous speech which is accessible through the 
Language Archive Cologne (Bracks et al. 2023). The study takes the prime struc- 
turing unit of speech - the Intonation Unit (IU) - as its principal unit of inves- 
tigation and presents a thorough description of the IU, develops an intonational 
model thereof and investigates the syntactic units it contains. The proposed in- 
tonational model is supported by experimental evidence of both production and 
perception. 

The results of the various approaches taken in this book show that Totoli falls 
under the category of Phrase Languages (Féry 2016). From what is known so 
far, Totoli shows no evidence of tonal specifications at the level of the word; the 
language does not make use of word stress nor of lexical tone. Prosodic promi- 
nence does not play a role in the marking of information-structural categories, 
and tonal specifications are assigned exclusively at the level of the intonational 
phrase and are associated with their right-edge boundary. Based on an investi- 
gation of tonal specifications and syntactic content of prosodic units of Totoli, I 
show that the data is best analyzed by assuming recursive embedding of IUs into 
Compound Intonation Units (CIU). 

When working with un(der)researched languages, one faces the task of finding 
appropriate tools for tapping into the prosodic system of the language. Himmel- 
mann (2006) and Himmelmann & Ladd (2008) argue that the study of the prosody 
of a language should best be supported with evidence obtained through different 
approaches. At best, it should contain an investigation of a substantial corpus of 
(semi-)spontaneous speech, the analysis of which is computer-aided and comple- 
mented with experimental evidence from production and perception. 

The study presented here follows this approach. It is a combination of quanti- 
tative and qualitative analyses and is based on an extensive dataset collected by 
the author in the course of a language documentation project (Bracks et al. 2023).
With a strongly data-driven approach, the study integrates a combination of ex- 
perimental evidence from both production and perception with corpus-based 
evidence through descriptive and inferential statistics. The data used for this re- 
search was collected within the Collaborative Research Center 1252 “Prominence 
in Language”, funded by the DFG (German Research Foundation), and was sup- 
ported by the Ministry of Research and Technology of the Republic of Indonesia 
through the provision of a research permit. During several field trips to Totoli
from 2017 to 2019, 196 hours of video material of various genres were collected,
including a 56-hour Child Language Corpus, 85 hours of elicitation recordings, and
31 hours of (semi-)spontaneous speech. At least 20 hours of these recordings are
transcribed. The subset used in this study consists of 2h 19min of recordings of 
(semi-)spontaneous/naturalistic speech, which is further described and discussed
in §3.1.1. The corpus is essentially an extension of the first language documentation
corpus (Leto et al. 2005-2010) and follows its glossing conventions and grammatical
analysis.

I make a number of analytical proposals which are relevant to prosodic the- 
ory and typology in general. This research represents a significant advancement 
in our understanding of the nature of prosodic systems found in (Western) Aus- 
tronesian languages and intonational systems in general. Additionally, the study 
adheres to the principle that research should be reproducible. Thus, all data is ex- 
plicitly referenced in the text and made available at an online repository (Bracks 
et al. 2023). Furthermore, examples from the corpus in this book are represented 
by periograms, which utilize automatically smoothed and interpolated pitch con- 
tours that are enriched with periodic energy. Periograms are thus phonetic repre- 
sentations that modulate pitch trajectories with periodic energy, by integrating 
“relevant acoustic cues into a perceptually motivated representation of the pitch 
contour of an utterance” (Albert et al. 2018: 807). I followed Albert et al.’s (2022) 
workflow but modified the color code. Throughout this book, pitch curves are 
displayed as yellow lines overlaying blue lines that represent information about 
periodic energy, as indicated by modifications in the transparency and width of 
the line. Syllable boundaries in the periograms are represented by thin, gray lines, 
while thicker lines are used to indicate word boundaries. 

Audio for all examples is provided alongside this book and is indicated by the 
“>”-sign. The first line (in italics) gives the phonemic transcription, disregarding 
allophonic realizations. This includes the particular case of word-final /l/ and its
allophonic realization as a length feature in word-final position in a process of
word-final compensatory lengthening (see §1.2.2). The examples include a second
line with the segmentable morphemes separated by hyphens. The third line
contains translations and abbreviated grammatical category labels in small capi- 
tals, and the fourth line provides a free translation to English. Information on the 
files of recordings in the Totoli archive (Bracks et al. 2023) is given in a final line. 
The examples from Totoli in this book follow the Leipzig Glossing Rules (Haspel- 
math 2015) and all glosses, abbreviations and other symbols used are explained 
above. 

The primary objective of the introductory section of this book is to provide an 
overview and essential information necessary to comprehend the main discus- 
sions of this book, which are presented in Chapters 2 and 3. These two chapters 
employ two distinct approaches to studying the prosody and intonation of Totoli, 
with Chapter 2 concentrating on experimental methods. The results of this chap- 
ter are subsequently integrated into the analysis presented in Chapter 3, which 
is based on corpus-based evidence and employs Intonation Units (IUs) as the 
primary unit of analysis. Chapter 3 is divided into three sections: §3.1 describes 
fundamental properties of IUs in the corpus, §3.2 develops an intonational model 
of IUs based on boundary tone events and the findings of the experiments pre- 
sented in Chapter 2, and §3.3 examines the syntactic content of IUs and complex 
Compound IUs. Finally, Chapter 4 summarizes the results, explores their impli- 
cations, and suggests future research avenues. 


1.1 Information on the Totoli language 


Totoli is an Austronesian language spoken in the Tolitoli regency (Kabupaten 
Tolitoli) of the Central Sulawesi province (Sulawesi Tengah) on the Indonesian 
island of Sulawesi. The linguistic area is divided into a southern region, primar- 
ily comprising the city of Tolitoli and surrounding villages, and a northern area 
consisting of the villages of Diule, Pinjan, Binontoan, Gio, and Lakuan (Tolitoli). 
Figure 1.1 shows the area where Totoli is spoken. The two linguistic areas are 
encircled. 

Himmelmann (1991: 18) calculates the ethnic population of the Totoli people to
be approximately 25,000 people, but estimates that only about 30% of them (ca.
7,000) are fluent speakers. Leto et al. (2005-2010) estimate a maximum of 5,000
fluent speakers. This number may have further declined over the last decade. In 
the city of Tolitoli and other villages in the southern linguistic region, it is safe 
to say that Totoli is never heard on the streets, and the everyday language of 
ethnic Totolis is Indonesian. Even in Totoli households, families speak almost ex- 
clusively in Indonesian. In contrast to the southern area, Totoli is more resilient 


Figure 1.1: Area where Totoli is spoken, adapted from Himmelmann (1991: 29)


in the five northern villages and is, to varying degrees, still a language of ev- 
eryday communication. In these villages, the great majority of ethnic Totolis are 
fluent speakers of Totoli, although in many areas of everyday life, Indonesian is 
the preferred language. Children are raised entirely in Indonesian and they are 
not actively taught Totoli by their parents. The only exception is the village of 
Gio. Interestingly, Himmelmann (1991: 28) lists the village of Gio as a Dondo- 
speaking settlement belonging to the village of Binontoan. Many Totoli families 
from Binontoan and Lakuan (Tolitoli) have moved to Gio since the 1990s, which 
resulted in a growth in the population of the village. As of 2019, 839 inhabitants 
live in 189 households in Gio (BPS Tolitoli 2019: 7). Although it was formerly a 
Dondo-speaking settlement, Dondo is now almost never heard there. Some of 
the original Dondo-speaking inhabitants still speak the language; their children, 
however, speak Totoli. It is now the only village where Totoli is the preferred 
language in almost all domains, and it is the only place where infants are reared 
almost exclusively in Totoli. Furthermore, it is now considered by other Totolis as 
the stronghold of the Totoli language, culture and music, and musicians from Gio 
practicing the verbal art of Lelegesan (Riesberg 2019, Bracks & Moss 2022) are 
frequently invited to perform in other Totoli-speaking villages, as well as in the 
city of Tolitoli. The community has become increasingly aware of the endangered
state of their language. Some young speakers have successfully promoted Totoli 
through short films and other content on social media and streaming platforms. 
Furthermore, the mayor of Binontoan village has established an improvised TV 
channel that primarily focuses on topics related to Totoli, such as music and fes- 
tivities. The channel features recordings of recent events captured on cellphones 
and other devices. 

Totoli is primarily a spoken language and is rarely used in written communi- 
cation, resulting in the absence of a standardized orthography. However, some 
community members occasionally write in Totoli on social media or cellphone 
messenger apps, using the orthography of the Indonesian language. In this book, 
examples from Totoli are presented in phonemic transcription. 


1.2 Aspects of the segmental phonology of Totoli 


As a necessary precursor to the subsequent chapters, I provide a brief descrip- 
tion of the fundamental aspects of Totoli’s segmental phonology relevant to this 
study. For a more detailed description, consult Bracks (submitted). The main fo- 
cus here is on the phoneme inventory (§1.2.1), along with a brief commentary on 
phonotactics and general patterns of word structure (§1.2.3). Additionally, the 
topic of vowel length and related processes in Totoli is explored in greater de- 
tail (§1.2.2), as it is pertinent to the ensuing exposition of Totoli’s prosody and 
intonation. 


1.2.1 Phoneme inventory 


The phoneme inventory of Totoli consists of 18 consonants and 5 vowels. Seven 
consonant phonemes have been introduced through loanwords, mainly from In- 
donesian and Arabic. 

The consonant phonemes are shown in Table 1.1, with the 7 marginal phonemes 
indicated in brackets. 

The vowel phonemes are shown in Table 1.2. 

The degree of allophonic variation in phoneme realization is generally limited,
except for the phoneme /l/. The following section on vowel length contains a
detailed explanation of this exception, as it is of importance to this study.
Table 1.1: Consonant phonemes of Totoli


              bilabial   alveolar   palatal    velar   glottal

stop          p  b       t  d       (c) (ɟ)    k  g
fricative                s                             (h)
nasal         m          n          (ɲ)        ŋ
lateral                  l
trill                    (r)
approximant   (w)                   (j)


Table 1.2: Vowel phonemes of Totoli


         front   central   back

close    i                 u
mid      e                 o
open             a


1.2.2 Vowel length 


Totoli phonetically distinguishes between long and short vowels. Both can occur 
word-initially, word-medially and word-finally. Some lexical roots inherently in- 
volve a long vowel and other long vowels are the result of affixation. 

Table 1.3 gives examples of long vowels in each position as they occur in roots 
and through affixation. 

In addition to the above, final long vowels can occur through a process of
compensatory lengthening. The lateral phoneme /l/ has three allophonic realizations:
[l] after front vowels, [ɭ] after back vowels, and a length feature on the
preceding vowel in word-final position. Table 1.4 shows examples for the different
allophonic realizations of /l/.

Evidence for analyzing final lengthening as an allophone of /l/ comes from
the “reappearance” of [ɭ] or [l] when suffixes are added to such bases. Note that
when clitics are added, no [ɭ] or [l] appears and the vowel remains lengthened.
Three examples from the corpus are given in (1)-(3).

In example (1), the base sumbol ‘life’ occurs unsuffixed. The phoneme /l/ is
realized in its word-final allophone as a length feature on the preceding vowel.
In example (2), the same base is followed by the enclitic =mo ‘CPL’, so /l/ is also
realized as a length feature on the preceding vowel. In example (3), the same base
is followed by a suffix with initial vowel /a/ and hence the /l/ is realized as [ɭ].
Table 1.3: Bases with, or processes leading to long vowels


                     EXAMPLE      POSITION OF   TRANSLATION/GLOSS
                                  LONG VOWEL

Lexical bases        tikəə        …             ‘neck’
                     …            …             ‘hear’
                     …            …             ‘saliva’

Long vowels          manankii     #_V:#         moN-sanki-i
through affixation                              AV-carry-APPL
                     kulobaanai   #_V:_#        ku=loba-an=ai
                                                1S.ACT=inform-APPL=VEN
                     iitaanna     #V:_#         i-ita-an=na
                                                RLS-see-UV=3S.GEN


Table 1.4: Allophones of lateral /l/


[l] after front vowels       lelean       [lelean]       ‘bridge’
                             siisiligna   [siːsiligna]   ‘looking at him/her’

[ɭ] after back vowels        bale         [baɭe]         ‘house’
                             nolumulas    [noɭumuɭas]    ‘scatter’
                             tuutulu      [tuːtuɭu]      ‘sleeping’

compensatory lengthening     ampil        [ampiː]        ‘side/twin’
in word-final position       monondol     [monondoː]     ‘regret’
CV[l]# → CVː#


(1) [mosumboː ana]
    mo-sumbol ana
    AV-live MED
    ‘(it is) alive’ (lifestory_RDA_1.160) >


(2) [nosumboːmo]
    no-sumbol=mo
    AV.RLS-grow=CPL
    ‘alive’ (monkey_turtle.130) >


(3) [nosumboɭangai]
    no-sumbol-an=ga=ai
    AV.RLS-grow-APPL=?=VEN
    ‘they grow again’ (tau_bentee.033) >


The regular omission of final laterals in loanwords provides further support
for this analysis: Malay kapal ‘ship’ > Totoli [kapaː]. Throughout this work, the
first line of examples gives a phonological representation of the examples. Hence,
the lateral /l/ in final position is represented as /l/ but phonetically realized as a
length feature.
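
The allophony of the lateral described in this section can be restated procedurally. The following Python sketch is an illustration only and not part of the analysis: it works on simplified phonemic strings, leaves long vowels as written in the input, uses ɭ as a stand-in symbol for the allophone found after back vowels, and assumes that /a/ patterns with the back vowels and that /l/ surfaces as plain [l] word-initially and after consonants. The function names are hypothetical.

```python
FRONT_VOWELS = set("ie")
BACK_VOWELS = set("uoa")  # assumption: /a/ patterns with the back vowels
VOWELS = FRONT_VOWELS | BACK_VOWELS


def realize_lateral(word: str) -> str:
    """Apply only the /l/ allophony rule to one phonemic word (no clitics)."""
    out = []
    for i, seg in enumerate(word):
        if seg != "l":
            out.append(seg)
            continue
        prev = word[i - 1] if i > 0 else ""
        if i == len(word) - 1 and prev in VOWELS:
            # word-final /l/ surfaces as length on the preceding vowel
            out.append("ː")
        elif prev in BACK_VOWELS:
            out.append("ɭ")
        else:
            # after front vowels; assumed also word-initially and after consonants
            out.append("l")
    return "".join(out)


def realize_word(stem_with_suffixes: str, clitics: tuple = ()) -> str:
    """Suffixes belong to the word and bleed lengthening; clitics attach afterwards."""
    return realize_lateral(stem_with_suffixes) + "".join(clitics)
```

Under these assumptions, `realize_lateral("ampil")` gives `ampiː` and `realize_lateral("bale")` gives `baɭe`. Adding the enclitic in `realize_word("nosumbol", ("mo",))` keeps the lengthened vowel (`nosumboːmo`), whereas the suffixed stem `realize_word("nosumbolan")` shows the lateral again (`nosumboɭan`).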

The case of word-final long vowels is important for the subsequent discussion 
of tonal patterns, presented in Chapter 3. I show that right-edge boundary-tone 
complexes are usually associated with the final and the prefinal syllable of an IU. 
If the final syllable involves a long vowel, the tonal pattern is realized on the final 
long syllable exclusively. This is illustrated and discussed in §3.2.3, examples (13) 
and (14). 


1.2.3 Phonotactics and word structure 


Most words in Totoli follow a strict CV-pattern. Consonant clusters are rare with 
the exception of homorganic nasal stop clusters, a common phenomenon in lan- 
guages that otherwise exhibit a rather strict CV-structure (Downing & Mtenje 
2017, Downing 2005, Reid 2000, Riehl 2008, Herbert 1986). In Totoli, such se- 
quences occur word-initially and word-medially but not in word-final position. 
Frequently they arise from a process commonly known as “nasal substitution” 
in the Austronesianist literature (Blust 2004, Pater 2004). In the examples, “nasal 
substitution” is represented by a capital N on the second line. Furthermore, Totoli
makes use of geminates, which occur word-initially and word-medially, but
not in word-final position. Some lexical roots involve geminates, but geminates
frequently result from the reduction of C₁V₁.C₁V₂ sequences, whereby the first vowel
is dropped, yielding C₁C₁V₂. Other heterorganic consonant sequences are very
rare. Only a few lexical bases involve such consonant sequences. They are permitted
across clitic boundaries but are also very infrequent there. Another
major morphophonological process in Totoli is vowel harmony in prefixation.
It is always regressive and is restricted to prefixes containing the vowel /o/ in
their citation form, such as moN-, noN-, mo-, no-, mog-, nog-, po-, pog-, and ko=.
The vowels of these prefixes occur as /o/ when they precede bases containing
/o/, /u/, or /i/ in their first syllable. However, when the first syllable of the base
contains /e/, the prefix vowel is realized as /e/, and when it contains /a/, the prefix
vowel is realized as /a/.
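
The harmony pattern just described amounts to a small lookup on the first vowel of the base. The Python sketch below is a hedged illustration, not the author's formalization: it writes the two mid vowels as `o` and `e`, treats the capital `N` of nasal substitution as an opaque symbol, and ignores every other morphophonological process. The function name and the third prefix-base pairing in the usage note are hypothetical and serve only to exercise the rule.

```python
VOWELS = "aeiou"


def harmonize_prefix(prefix: str, base: str) -> str:
    """Return the prefix with its /o/ adjusted to the first vowel of the base.

    Bases whose first vowel is /o/, /u/ or /i/ keep the prefix vowel /o/;
    /e/ and /a/ in the first syllable of the base are copied onto the prefix.
    """
    first_vowel = next((seg for seg in base if seg in VOWELS), None)
    if first_vowel is None:
        return prefix  # no vowel found: leave the citation form untouched
    target = {"a": "a", "e": "e"}.get(first_vowel, "o")
    # only lowercase 'o' is replaced, so the archiphoneme 'N' is unaffected
    return prefix.replace("o", target)
```

For instance, `harmonize_prefix("mo-", "sumbol")` returns `mo-`, and `harmonize_prefix("moN-", "sanki")` returns `maN-`, in line with the form manankii in Table 1.3. A base with /e/ in its first syllable, as in the purely illustrative call `harmonize_prefix("mo-", "lelean")`, yields `me-`.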

Additionally, reduplication is a common morphophonological process in both 
verbal and nominal morphology in Totoli. This process encompasses various 
forms, all of which are represented by a single label, RDP, in the glossing of ex- 
amples. 

The preceding discussion provides a concise overview of the fundamental
aspects of segmental phonology necessary for following the discussion
of Totoli’s prosody and intonation in Chapters 2 and 3.


1.3 Research on the prosody of Austronesian languages 


Little is known so far about the prosody of Austronesian languages, a fact also ac- 
knowledged by Himmelmann & Kaufman’s (2020) chapter on the state of the re- 
search on the prosody and intonation of Austronesian languages. Himmelmann 
(2018: 348) proposes a model of the basic structure of the Intonation Unit (here- 
after: IU) in Austronesian languages of Indonesia and East Timor. More thorough 
phonetic studies that have been conducted on Austronesian languages of Indone- 
sia in recent years suggest that many languages in the area may lack word-level 
prominence and that tonal targets are primarily assigned at the phrase level. In- 
donesian/Malay as one of the major languages in the region has stirred debate 
about “stress” placement and its existence (for a summary see Goedemans & van 
Zanten 2007: 28-9). For Indonesian, as well as for many other Austronesian lan- 
guages, the position of word stress is often claimed to be on the penultimate 
syllable of a word. Analyzing this claim on the assumption that speakers of In- 
donesian as a second language show a strong L1 influence, Goedemans & van 
Zanten (2007: 42) compared Indonesian spoken by Toba Batak speakers with that 
of Javanese speakers. They found that Toba Batak speakers produce the penulti- 
mate syllable of IU-final words in focus condition with higher intensity, longer 
duration and a rise in F0. Speakers of Indonesian with Javanese as their first
language, however, produce the words in the same condition only with a rise in F0,
whilst duration and intensity are not affected. They conclude that Indonesian 
spoken by Toba Batak speakers exhibits prominence on the level of the word 
as well as the phrase. For speakers with a Javanese background, however, they 
“only found evidence for prominence at the phrase level (in the form of pitch 
movements)” (Goedemans & van Zanten 2007: 45). The results found for the In- 
donesian of Javanese L1 speakers are in fact similar to what has been reported 
about the Indonesian/Malay variety Ambonese Malay, spoken in Eastern Indonesia
on the Maluku Islands. Analyzing IU-final F0 movements in Ambonese Malay,
Maskikit-Essed & Gussenhoven (2016: 382) found no association of the timing
of IU-final boundary-tone complexes with any syllable. Moreover, focus condi- 
tion did not reveal any systematic effect on the range and shape of the pitch 
on IU-final words. Hence the authors opt for analyzing IU-final tone complexes 
as floating boundary tones, since such an analysis assumes neither word stress 
nor pitch accent, whether associated lexically or postlexically (Maskikit-Essed & 
Gussenhoven 2016: 356). They conclude that IU-final boundary-tone complexes 
may instead signal the function of sentences. The absence of word prosody and 
the assignment of tone complexes to boundaries of prosodic domains fit the char- 
acteristics of what Féry (2016) labels Phrase Languages. 

In addition to the studies of phonetic correlates of stress in Austronesian lan- 
guages, only a small number of analyses of the intonation of Austronesian lan- 
guages exist: Himmelmann (2010) presents a description of the intonation of 
Waima’a, spoken in East Timor; Maskikit-Essed & Gussenhoven (2016) describe 
the two most common IU-final pitch melodies of Ambonese Malay; Stoel (2006) 
proposes a concise description of the intonation of Banyumas Javanese. These 
studies are based on a set of target phrases or question-answer pairs, the real- 
ization of which has been taken as generalizable over the intonational system of 
the language as a whole. Such an approach may be suitable for the description of 
the major aspects of the intonation of a language. However, the frequency distri- 
bution of patterns and also less frequently used intonation patterns may only be 
observed in a corpus study, covering different communicative events. Possibly 
the only study conducted on the intonation of an Indonesian language which 
is primarily based on a corpus of spoken spontaneous discourse is that of Stoel 
(2005) on Manado Malay. 


1.4 The units of spoken speech 


This research is based on the analysis of a large corpus of speech. The first hurdle 
one faces when dealing with corpora of (semi-)spontaneous speech is the iden- 
tification and segmentation of the data into tangible units (Himmelmann 2006, 
Edwards & Lampert 1993). Speech can be segmented into various units of different
sizes, though most studies recognize the Intonation Unit (IU) (Chafe 1994) as
the basic unit into which discourse and the flow of speech is structured. 

The IU has been discussed under a variety of other names such as the Tone 
Group (Halliday 1967), the Tone Unit (Crystal 1976), the Intonation/Intonational 
Phrase (Selkirk 1986, Nespor & Vogel 1983, Pierrehumbert 1980, Ladd 2008, Gussen- 
hoven & Chen 2020, Jun 2005c, 2014a), and the Breath Group (Lieberman 1966, 
Lieberman et al. 1970). The definitions of these terms differ in their details.
Underlying this study is the basic definition of the IU by Chafe (1987: 22):


An intonation unit is a sequence of words combined under a single, coherent 
intonation contour, usually preceded by a pause. 


The coherent intonation contour is the defining characteristic of an Intona- 
tion Unit. A number of features have been identified which contribute to the 
perceived single, coherent intonation contour. Other features, on the other hand, 
delimit a speech segment and indicate the boundary of an IU. The criteria dis- 
cussed pertain mainly to pitch, rhythm, and voice quality, but non-prosodic fea- 
tures have also been identified, such as the end of a turn/the change of speaker, 
inhalation, and lexical boundary markers (Schuetze-Coburn 1994: 93-155, Him- 
melmann 2006: 260-270, Du Bois et al. 1992, 1993, and Cruttenden 1997: 29-39). 

Tao (1996: 52) mentions that discourse particles also proved a reliable criterion 
for the identification of IU boundaries in Mandarin Chinese, as they correlate 
highly with IU boundaries. Strictly speaking, however, prosodic clitics are a syn- 
tactic criterion and, as such, should not be used to identify any prosodic unit 
(see the discussion in §3.3). A single IU-boundary feature alone does not suffice 
to reliably detect an IU boundary, and hardly any IU exhibits all of the boundary 
cues: 


The relative importance of the cues may differ — pitch reset, for example, 
is arguably more central than tempo modulation — but none alone defines 
an IU boundary per se; rather, a conjunction of cues is usually required 
for an IU to be perceived. One can say that the prototypical IU exhibits 
all of these cues, yet seldom are all actually present in any given instance. 
That is, most IUs deviate from the prototype to some degree. Thus, a given 
IU may exhibit pitch reset and a definite contour, but none of the other 
features. (Schuetze-Coburn et al. 1991: 217) 


Hence, the IU is defined in terms of a prototype and “the more features that 
coalesce at any point, the stronger (‘more prototypical’) the boundary will be, 
but an IU boundary may also be perceived when only one or two features occur” 
(Schuetze-Coburn et al. 1991: 227). 
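
The prototype view of the IU boundary can be pictured as a simple cue count. The following Python sketch is a toy illustration under stated assumptions: the cue inventory is an abridged, hypothetical selection of the features named above, all cues are weighted equally (although the quotation suggests that pitch reset is arguably more central than tempo modulation), and no perception threshold is implied. The class and function names are invented for this illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BoundaryCues:
    """Presence or absence of boundary cues at one candidate position."""
    pitch_reset: bool = False
    definite_contour: bool = False
    tempo_modulation: bool = False
    pause: bool = False
    voice_quality_change: bool = False


def boundary_strength(cues: BoundaryCues) -> int:
    """Count the cues that coalesce at this point (equal weights assumed)."""
    return sum(bool(getattr(cues, f.name)) for f in fields(cues))


def rank_candidates(candidates: dict) -> list:
    """Order candidate positions from most to least prototypical boundary."""
    return sorted(candidates,
                  key=lambda name: boundary_strength(candidates[name]),
                  reverse=True)
```

A position showing only pitch reset and a definite contour receives a strength of 2 in this sketch; it is less prototypical than a position showing all five cues, but nothing in the sketch rules it out as a perceived boundary.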

Many discourse-oriented linguists report on difficulties with identifying IU 
boundaries and comment on the sometimes tedious nature of the task. While 
Brown et al. (2015: 46) report “constant difficulty in identifying tone groups in 
spontaneous speech”, a great many other linguists working with discourse data 
admit that, no matter how difficult the task, with “practice and appropriate
guidance, however, one should be able to attain a reasonably high degree of
inter-transcriber reliability” (Du Bois et al. 1992: 112). On that matter, Chafe (1994)
comments that “in a better world they would be as important a part of the train- 
ing of a linguist as the ability to transcribe vowels and consonants” (see also 
comments from Schuetze-Coburn 1994: 165, Crystal 1976: 206, and Cruttenden 
1997: 29). Based on his hands-on experience of working with various corpora, 
Himmelmann (2006: 261) reports that an estimated proportion of 80-90% of IUs 
are rather unproblematically identifiable, which also reflects my own experience. 

Despite the many remarks about the difficult nature of the transcription pro- 
cess, studies have shown that even naive, untrained speakers perform remark- 
ably well in segmenting discourse. Kreckel (1981) conducted an experiment in 
which untrained, native English-speaking participants were presented with a
written transcript and a corresponding (English) audio recording. Participants 
were asked to mark ‘message’ boundaries on the transcript. The results showed 
that the participants segmented speech into Intonation Units (i.e. ‘tone groups’) 
with a high degree of interrater agreement. Furthermore, participants gave pri- 
ority to prosodic cues over syntactic ones. 

In recent years, these findings have been confirmed by a number of studies 
using the Rapid Prosody Transcription method (RPT; Cole & Shattuck-Hufnagel 
2016). Its boundary-marking task is similar to the method used by Kreckel (1981), 
and the results obtained from a number of typologically unrelated languages 
show a high degree of interrater agreement on the placement of boundaries. In 
§2.1.5, I give an overview of agreement results from different studies and
compare these to the results obtained from an RPT study of Totoli (cf. Figure 2.5).
However, results from the boundary-marking task obtained from RPT experi- 
ments are usually not discussed as evidence for the perception of IU boundaries. 
Yet, in §3.1.3, I correlate the results from the boundary-rating task of the RPT 
with IU boundaries which occur within rated speech segments. The results show 
that naive listeners can indeed reliably identify IU boundaries. While the
universality of prosodic units below the IU is subject to debate (Bickel & Zúñiga 2017,
Schiering et al. 2010), the IU is widely accepted.

The Intonation Unit as a discourse-structuring unit has been successfully em- 
ployed by studies on a variety of typologically unrelated languages. If prosodic 
cues that delimit IUs are similar across languages, then listeners should be able 
to identify IU boundaries even in languages they are not familiar with. In this re- 
gard, Ford & Thompson (1996: 174) briefly commented that trained transcribers 
can reliably identify IU boundaries in an unknown language with a precision 
of 85-90%. This observation has been put to the test only by Himmelmann et al. 
(2018), who investigated the inter-transcriber agreement of IU segmentation
done by trained transcribers on familiar and unfamiliar languages from different 
language families (German, Papuan Malay, Wooi, Yali). The results showed statis- 
tically significant inter-transcriber agreement on the placement of IU boundaries, 
which led the authors to postulate the Universal Phonetic IP Hypothesis (UPIPH). 
The hypothesis claims 


[...] that all natural languages make use of the same kinds of phonetic cues 
for IPs, and that these cues can be perceived by speaker-hearers even in 
unfamiliar languages. [...] We believe that it is quite likely that phonologi- 
cal IPs are part of the prosodic system of all natural languages. If this is the 
case, IPs would be a prime example of a universally attested phonological 
category. (Himmelmann et al. 2018: 239-240) 


The UPIPH is a strong claim and further data from different languages is 
needed to substantiate it. Furthermore, an investigation into the comparison of 
the various cues for IP boundaries may yield interesting cross-linguistic simi- 
larities and/or differences. However, with supporting evidence from a variety 
of languages, it appears that all speakers organize their speech into Intonation 
Units, which are perceived as such by the listener. 

As will become evident from the analysis of tonal patterns of segmented IUs of 
the corpus in §3.2, we have to assume recursive embedding of Intonation Units 
into Compound Intonation Units in Totoli. While some segmentable stretches 
of speech of the corpus occur as simple, singleton IUs, others occur as complex, 
Compound IUs, all of which are subsumed under the label CIU. 
2 Experimental approaches to the prosody of Totoli


Himmelmann & Ladd (2008: 250) summarize that sentence-level prosody is typi- 
cally employed in marking sentence modality, phrasing, and prominence. In this 
chapter, I present experimental evidence that focuses particularly, though not 
exclusively, on the latter, i.e. the role of prominence in the prosody of Totoli. 
The chapter consists of two sections. In the first section, I present the results of a 
Rapid Prosody Transcription (RPT) experiment (§2.1). This setup has proven par- 
ticularly useful for obtaining a first impression of the prosody of a little-known 
language. The study’s results are complemented by two focus marking experi- 
ments, which constitute the second section of this chapter (§2.2). This section ex- 
plores whether prosodic prominence is used to mark the information-structural 
category focus. To ensure adherence to the fundamental principle of complete 
reproducibility, R scripts and raw data can be readily downloaded from Bracks 
(2023). 


2.1 Investigating the role of prosodic prominence through 
a Rapid Prosody Transcription experiment 


As a first step towards understanding the prosody of Totoli, I conducted a Rapid 
Prosody Transcription (RPT) experiment to gain preliminary insights into the 
role of prosodic prominence in the language’s intonation. The RPT method is a 
simple and relatively quick tool that captures listeners’ perception of boundaries 
and prominences (Mo et al. 2008, Cole et al. 2014). A description of the method 
is given by Cole & Shattuck-Hufnagel (2016: 11): 


[It] draws on linguistic theories of prosody (or intonation) in recognizing 
prominence and phrasing as two separate dimensions of prosodic form, 
and as such RPT can be used within any theoretical framework that rec- 
ognizes prominence and phrasing, as a means of tapping into ordinary 
listeners’ subjective impression of prominences and boundaries in speech. 




In an RPT experiment, speakers are presented with speech samples and a tran- 
scription thereof and are asked to identify perceived prominences and bound- 
aries, based on their auditory impression of the recording. The task does not 
require any experience in prosodic transcription or linguistic knowledge. The 
RPT method has been employed in a number of studies on well-researched lan- 
guages such as English (Mo et al. 2008, Cole et al. 2010), German (Riesberg et al. 
2020), Estonian (Ots & Asu 2019), and Korean (You 2012). 

Crucial to the choice of RPT as the experimental approach to the prosody of 
Totoli is Cole & Shattuck-Hufnagel’s note that it is particularly useful 
for “populations not easily accessed from the university communities where 
most prosody researchers reside [and it] opens the door to obtaining prosody 
judgments from minority linguistic communities, from elderly people and those 
in rural communities, and from communities of language learners” (2016: 12). 
Riesberg et al. (2018, 2020) followed up on this suggestion and successfully em- 
ployed the RPT method in a study on Papuan Malay. 

In this light, RPT provides a suitable setup for the present investigation. 


2.1.1 Materials 


For the experiment, speech samples were taken from recordings of Pear Story 
(Chafe 1980) retellings from the Totoli language corpus (see §3.1). Five different 
speakers were selected based on the quality of their retelling in terms of smooth 
speaking flow and naturalness. A total of 71 speech samples were taken from the 
recordings, each ranging between 1.37 and 6.73 seconds in length and comprising 
between one and three CIUs. 

Table 2.1 gives an overview of the number of CIUs the speech samples contain. 


Table 2.1: Number of CIUs contained in the samples used in the RPT experiment 

Number of speech samples    Number of CIUs in speech sample 
41                          1 
26                          2 
4                           3 
total = 71 


Table 2.2 gives an overview of the number of words and the duration of sam- 
ples used in the RPT experiment. 
Table 2.2: Duration (in seconds) and number of words in speech samples 

                      min    max    mean   median   sd     total 
number of words       4      13     6.83   6        2.32   484 
duration in seconds   1.37   6.73   2.78   2.68     1.03   197.26 


Further information about the speakers of the stimuli is given in §A.1.1 of the 
Appendix. 

The speech samples were presented without any punctuation and used the 
local orthography. An example is discussed below (see example (1) and its real- 
ization in Figure 2.2). 


2.1.2 Participants 


Twenty native Totoli speakers were recruited for the experiment: 12 male and 8 
female (mean age = 30.05; age range = 18-45). Participants were required to be 
fluent in Totoli and possess good computer skills. All participants reported being 
born and raised in the Tolitoli regency (Kabupaten Tolitoli) with Totoli 
as their first language. They are also fluent speakers of the spoken 
variety of Indonesian/Malay in the region, and to varying degrees, Standard In- 
donesian. Further information regarding the participants can be found in §A.1.2 
of the Appendix. 

Totoli is an endangered language and, as such, the recruitment of participants 
is challenging. Consequently, all 20 participants in the experiment were asked 
to perform both boundary and prominence judgments. To control for potential 
task order effects, I followed the approach of Mo et al. (2008) and divided the par- 
ticipants into two groups. The first group rated prominences before boundaries, 
while the second group completed the tasks in the reverse order. 


2.1.3 Procedure 


In an RPT experiment, participants are presented with speech samples along 
with a transcription of the recording and are asked to identify perceived promi- 
nences and boundaries based on their auditory impressions. It is noteworthy that 
the task does not require any experience in prosodic transcription or linguistic 
knowledge. 
The stimuli were presented via the LMEDS web interface (Cole et al. 2017, 
Mahrt 2016: 206). Since Indonesian is the national language and the medium 
of formal education, the instructions were given in Indonesian, as participants 
would find it highly unnatural to receive instructions in Totoli. To maintain con- 
sistency and comparability, the instructions and examples were taken from Ries- 
berg et al. (2018: 409-411) and reprinted in §A.1.3 and §A.1.4 of the Appendix. 

Boundaries were briefly explained to participants as a tool employed by speak- 
ers to chunk some words together or separate others (see §A.1.3 of the Appendix). 
An example of grouped numbers in a long telephone number was given: 


229 100 2999 
A second example was given which was equivalent to:* 
‘Teat, Father” vs. “Teat father.” 


The concept of prominence, on the other hand, has no exact equivalent in In- 
donesian (compare also Cole & Shattuck-Hufnagel 2016: 29). Riesberg et al. (2018: 
409) describe prominences as a way in which speakers make some words stand 
out (Indonesian: menonjol ‘to stand out’) and state that this can usually be heard 
or felt by the listener. The exact wording and an English translation are reprinted 
in §A.1.4 of the Appendix. 

Two Indonesian examples were presented to the participants; their English 
translations are reprinted here: 


1) She sees a cow 


2) She sees a cow and a horse eating grass 


In the LMEDS web interface, speakers click on a word: in the boundary-marking 
task, a vertical bar (|) appears after a selected word to indicate a boundary; in 
the prominence-marking task, the selected word appears in bold. Participants 
listened to the audio exactly twice. Selection of words (i.e. placement of bound- 
aries or prominences) was permitted only after participants had listened to the 
speech sample at least once. No time constraint was given for marking either the 
boundaries or prominences for a respective speech sample. Participants were told 
explicitly that they were free to mark as many or as few boundaries or prominent 
words as they wanted. They were also told that they could change their minds, 
selecting and deselecting words freely before continuing to the next speech sample. 


* The Indonesian example given was: Bapak saya sudah datang. “My father is already at home.” 
vs. Bapak, saya sudah datang. “Father, I am already home.”; cf. Riesberg et al. (2018: 409-412). 

The prominence- and boundary-rating tasks are illustrated in example (1), taken 
from the speech samples used in the experiment. The glossing and translation 
are included here for the reader only, and the boundary and prominence marking is 
arbitrarily chosen for the illustration of the task. In the experiments, participants 
were presented with the transcript (i.e. the first line) only. 


(1) isakema ulay dei sapeda manana ia nallakym» 
isakemo ulan | dei sapeda | manana ia nallakomo 


ni-sake-0-m2 ulan dei sapeda maņanaia no-RDP-lak9=mo 
RLS-put.up-UV=CPL again LOC bike child PRX AV.RLS-RDP-walk=CPL 


‘(after he) puts it on his bicycle again, the child walks off.’ 


(pearstory 9 FAH.039-40) > 


Before the Totoli data was presented to the participants, they completed a 
training run with four Indonesian speech samples taken from Pear Story retellings. 
Participants had no prior experience of taking part in an experiment, and 
trial runs in the language they are most familiar with as a written medium were 
deemed necessary so that they could get accustomed to the task. Riesberg et 
al. (2020) showed that participants are very sensitive to language-specific cues in 
the marking of prominences, even in languages they are not familiar with. Based 
on these results, I do not expect any influence of the trial runs in Indonesian, 
although a potential influence on the overall result cannot be excluded. 


2.1.4 Analysis 


Participants rated 71 speech samples. Boundaries placed after the last word of 
a given speech sample were discarded, as no judgment was needed there. Fol- 
lowing Cole et al. (2010: 304), I calculated boundary scores (b-scores) and promi- 
nence scores (p-scores) for each word, representing the proportion of speakers 
who marked the respective word as prominent or as preceding a boundary. 
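
The score calculation described above can be sketched as follows. This is only a minimal illustration in Python, not the R scripts accompanying the study: the function names `proportion_marked` and `drop_final_boundaries` and the toy ratings are hypothetical, and scores are expressed here as proportions between 0 and 1.

```python
# Minimal sketch of p-score / b-score computation (hypothetical names and toy data).

def proportion_marked(marks, n_words):
    """Return, for each word position, the proportion of raters who marked it.

    marks: one set of marked word indices per rater.
    """
    n_raters = len(marks)
    return [sum(1 for m in marks if i in m) / n_raters for i in range(n_words)]


def drop_final_boundaries(marks, n_words):
    """Discard boundary marks placed after the last word of a sample."""
    return [{i for i in m if i != n_words - 1} for m in marks]


# Toy example: four hypothetical raters, one five-word speech sample.
N_WORDS = 5
prominence_marks = [{0, 4}, {4}, {0}, {2, 4}]
boundary_marks = [{1, 4}, {1}, {1, 4}, {1}]  # index 4 is the sample-final word

p_scores = proportion_marked(prominence_marks, N_WORDS)
# The sample-final position is excluded from the boundary scores.
b_scores = proportion_marked(
    drop_final_boundaries(boundary_marks, N_WORDS), N_WORDS
)[:-1]
```

In this toy sample, every rater places a boundary after the second word, so that word receives the maximal b-score, while the final word has no b-score at all.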

In Figure 2.1, the results of both tasks are illustrated for the speech sample 
presented above in example (1). The speech sample consists of two CIUs. 

Figure 2.1 shows that most speakers perceived a boundary following sapeda 
‘bicycle’ (b-score = 90), which coincides with the location of a CIU boundary 
determined by my analysis (see §1.4). Similarly noticeable are the relatively high 
p-scores for the first word of the first CIU, isakemo ‘put up’ (p-score = 45), and 
the last word of the second CIU, nallakomo ‘to go’ (p-score = 70). 


[isakemo ulan dei sapeda] [manana ia nallakomo.] 


Figure 2.1: p-scores and b-scores for example (1); square brackets indicate 
CIU boundaries. 

Figure 2.2 shows the periogram with pitch track in semitones (st) of example 
(1). The pitch contour is given in yellow. The blue line in the background gives 
information on periodic energy, represented by the width and transparency of 
the line. The prominence ratings are indicated in red and the boundary ratings 
are indicated in black. 
Figure 2.2: Periogram with pitch track (in st) for example (1): order of 
tiers from top to bottom is word, p-score, b-score; speaker FAH. 


Inspection of the pitch contour given in Figure 2.2 and its comparison with the 
respective boundary and prominence ratings in Figure 2.1 shows that boundary 
ratings, prominence ratings and pitch rises appear to largely coincide. 
To analyze the degree of agreement between participants, I calculated Fleiss’ 
kappa and Cohen’s kappa (κ) coefficients (Fleiss 1971, Cohen 1960). Cohen’s κ 
measures the agreement of judgments between two participants over all rated 
items, thus providing (n² − n)/2 values for n participants (190 pairs for the 20 
participants here). Fleiss’ κ is a measure that provides a single figure 
indicating the overall agreement among all participants. Kappa values range 
from 0 to (−)1. A value of 0 indicates agreement at 
chance level and a positive value indicates agreement above chance level (Cramer 
& Howitt 2004: 83). 
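
To make the two measures concrete, the following sketch implements the textbook formulas for Cohen’s κ (Cohen 1960) and Fleiss’ κ (Fleiss 1971) in plain Python. It is an illustration under stated assumptions, not the analysis code of this study: the function names and the toy ratings of three raters over six words are hypothetical, and both functions assume that chance agreement is below 1.

```python
# Minimal sketch of Cohen's and Fleiss' kappa (hypothetical toy data).
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' labels over the same items."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    categories = set(a) | set(b)
    # Chance agreement from each rater's marginal label proportions.
    p_chance = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings holds one list of labels per rater."""
    n_raters = len(ratings)
    n_items = len(ratings[0])
    categories = sorted({label for r in ratings for label in r})
    # counts[i][j]: number of raters assigning item i to category j.
    counts = [[sum(r[i] == c for r in ratings) for c in categories]
              for i in range(n_items)]
    per_item = [(sum(c * c for c in row) - n_raters)
                / (n_raters * (n_raters - 1)) for row in counts]
    p_observed = sum(per_item) / n_items
    shares = [sum(row[j] for row in counts) / (n_items * n_raters)
              for j in range(len(categories))]
    p_chance = sum(s * s for s in shares)
    return (p_observed - p_chance) / (1 - p_chance)


# 1 = word marked, 0 = word not marked; three hypothetical raters, six words.
ratings = [
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
]
pairwise = {(i, j): cohen_kappa(ratings[i], ratings[j])
            for i, j in combinations(range(len(ratings)), 2)}
overall = fleiss_kappa(ratings)
n_pairs_20_raters = (20 ** 2 - 20) // 2  # number of rater pairs for n = 20
```

Three raters yield three pairwise Cohen’s κ values but only one Fleiss’ κ, which is why the pairwise values are best inspected as a distribution.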

First, I discuss the Fleiss’ kappa and Cohen’s kappa for Totoli before relating 
the results more meaningfully to those reported for other studies. 


2.1.4.1 Fleiss’ kappa coefficients 


Table 2.3 shows the Fleiss’ κ coefficients for the boundary-rating task and the 
prominence-rating task. It provides values for all raters together, as well as 
separately for the group that rated boundaries first and then prominences and vice 
versa. 


Table 2.3: Fleiss’ kappa: the difference between rated subjects equals 
the 71 discarded words in stimulus-final position for boundary ratings. 

prominences    all      boundaries first    prominences first 
Stimuli        484      484                 484 
Raters         20       10                  10 
Kappa          0.143    0.138               0.165 

boundaries     all      boundaries first    prominences first 
Stimuli        413      413                 413 
Raters         20       10                  10 
Kappa          0.485    0.497               0.506 


Here, I interpret the kappa values only in relation to each other. Comparing 
the values between the two tasks gives information about the extent to which the 
raters agree on the placement of prominences in comparison to the placement of 
boundaries. The kappa values in Table 2.3 are substantially lower for judgments 
of prominence placement (κ = 0.143) than for judgments of boundary placement 
(κ = 0.485). 

The comparison of kappa values between the two groups, i.e. those who rated 
boundaries first and then prominences and vice versa, provides information about 
the influence of the task order. The comparison shows that, similar to the 
findings of Mo et al. (2008: 736), task order does not have a strong 
effect (prominence placement: κ = 0.138 and 0.165; boundary placement: κ = 0.497 
and 0.506). 

In sum, the participants agreed substantially more on the placement of bound- 
aries than they did on the placement of prominences. The task order, however, 
had only a marginal effect on the kappa values. 

To further analyze the degree of agreement between individual pairs of speak- 
ers, Cohen’s kappa coefficients were calculated. 


2.1.4.2 Cohen’s kappa coefficients 


The distribution of Cohen’s kappa coefficients for pairwise interrater agreement 
between speakers is shown below in the violin plot in Figure 2.3. 


[Violin plot; panels: boundaries, prominences; y-axis: Cohen's kappa coefficients] 


Figure 2.3: Distribution of Cohen’s kappa coefficients of 190 rater pairs: 
mean boundaries = 0.49, median boundaries = 0.52; mean prominences 
= 0.15, median prominences = 0.14 


Figure 2.3 shows that the pairwise agreement on boundaries is substantially 
higher than agreement for prominences. Landis & Koch (1977: 165) propose agree- 
ment bins for the classification of pairwise interrater agreement values. Figure 2.4 
shows the frequency distribution across agreement bins: <0.00 = poor; 0.00-0.20 
= slight; 0.21-0.40 = fair; 0.41-0.60 = moderate; 0.61-0.80 = substantial; 0.81-1.00 
= almost perfect. 
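
The binning can be sketched in a few lines of Python. This is a hedged illustration only: the function names and the list of κ values are hypothetical, and the way values falling between two printed bin edges (e.g. between 0.20 and 0.21) are assigned is an assumption of this sketch.

```python
# Minimal sketch of the Landis & Koch (1977) agreement bins (hypothetical data).
from collections import Counter

LABELS = ["poor", "slight", "fair", "moderate", "substantial", "almost perfect"]


def landis_koch(kappa):
    """Map a kappa value to its agreement bin.

    Assumption: a value between two printed edges goes to the higher bin.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def bin_shares(kappas):
    """Percentage of kappa values falling into each bin."""
    counts = Counter(landis_koch(k) for k in kappas)
    return {label: 100 * counts[label] / len(kappas) for label in LABELS}


# Hypothetical pairwise kappa values, not taken from the experiment.
example_kappas = [-0.05, 0.10, 0.15, 0.35, 0.52, 0.52, 0.65, 0.90]
shares = bin_shares(example_kappas)
```

With 190 rater pairs, each pair contributes roughly 0.53 percentage points to a bin, so three pairs correspond to about 1.58%.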

In the case of boundaries, no pairs are found in the category poor, 6.84% are in 
the category slight and 16.32% in the category fair, with the majority being in 
the category moderate (57.37%). Some are even in the category substantial (19.47%). 
As for prominences, 79.48% of all pairs are found in the categories poor and slight. 
Moderate is only attested for three pairs (1.58%) and none fall into the categories 
substantial or almost perfect. 
[Bar charts, two panels; x-axis: agreement bins (Poor, Slight, Fair, Moderate, Substantial, Almost Perfect)] 


Figure 2.4: Frequency distribution of Cohen’s kappa coefficients in 
agreement bins according to Landis & Koch (1977: 165): total numbers 
are indicated in brackets. 


2.1.5 Discussion 


The Fleiss’ κ and Cohen’s κ coefficients above show that participants generally 
agree only very little on the judgment of which word in a given speech sample is 
prominent. They agree considerably better on the placement of boundaries. This 
is especially evident when considering the pairwise interrater agreements of the 
Cohen’s κ. 

As measurements of interrater agreement, Cohen’s κ and Fleiss’ κ have been 
used in a number of RPT studies on different languages, providing a growing 
body of literature for comparison. Figures 2.5 and 2.6 compare the results ob- 
tained for Totoli with those reported by other studies. There are more studies 
that report on values for agreements on prominence ratings than on boundary 
ratings. Note, however, that the different studies vary with regard to the speech 
samples used, which may limit their comparability. 

In the studies of American English (Mo et al. 2008, Cole et al. 2010), excerpts 
from the Buckeye Corpus of spontaneous conversations (Pitt et al. 2007) were 
used for the experiment. Similarly, the study of Papuan Malay (Riesberg et al. 
2018, 2020) used samples of spontaneous speech, obtained from recordings of 
the Pear Film (Chafe 1980) and the Tangram Task. The study on Estonian (Ots & 
Asu 2019) used excerpts of the Phonetic Corpus of Estonian Spontaneous Speech. 
You (2012) used answers obtained from two native speakers replying to a set of 
questions in Korean. Lastly, data on German are reported by Baumann & Winter 
(2018), who used sentences read out loud for the RPT experiment. The study 
of Totoli is most comparable to that on Papuan Malay by Riesberg et al. (2020) 
regarding the experimental setup and material used. 
Figure 2.5 shows the Fleiss’ κ coefficients of the boundary-rating task reported 
in the studies mentioned above, comparing these with the results obtained from 
Totoli. 


[Plot; x-axis: language (American English, Korean, Papuan Malay, Totoli); y-axis: Fleiss' kappa coefficients] 


Figure 2.5: Fleiss’ κ for boundary ratings using the RPT method: dots 
indicate single reported values, lines indicate reported range. 


Figure 2.6 shows the Fleiss’ κ coefficients reported in various studies for the 
prominence-rating task. 
[Plot; y-axis: Fleiss' kappa coefficients] 


Figure 2.6: Fleiss’ κ for prominence ratings using the RPT method: dots 
indicate single reported values, lines indicate reported range. 


Across the languages, Fleiss’ κ coefficients vary substantially for prominence 
ratings. The highest results for agreement were found for German with a Fleiss’ 
κ coefficient of 0.53 (Baumann & Winter 2018). The lowest results were reported 
for Papuan Malay (Riesberg et al. 2018, 2020) with a Fleiss’ κ coefficient of only 
0.103. The results obtained for Totoli (κ = 0.143) are similar to those for Papuan 
Malay, although slightly higher. 

In sum, the comparison of prominence and boundary ratings across studies 
on various languages shows a higher degree of agreement for boundary 
rating than for prominence rating. With regard to boundary rating, the results 
for Totoli are similar to those reported by other studies (see Figure 2.5). 
With regard to prominences, however, Totoli listeners show comparatively low 
agreement values that are most similar to those reported for Papuan Malay (see 
Figure 2.6). The results suggest that prosodic prominence may not be a 
relevant category in the prosodic system of Totoli. 

This is in fact similar to what Riesberg et al. (2018) report for Papuan Malay, 
where participants show comparatively low agreement on prominence place- 
ment and appear to perceive prominences mainly at boundaries. Riesberg et al. 
(2020: 2) caution that the results obtained from an RPT study, such as the above, 
“cannot establish ‘facts’ of the type ‘language X makes use of pitch accents’ or 
‘speakers of language X hear durational differences as marking lexical stress’.” 
Therefore, I conducted two further experiments, reported in §2.2, that examine 
the assumption about a possible lack of prosodic prominence marking in Totoli. 


2.2 Investigating the role of prosodic prominence through 
a focus marking experiment 


In order to further investigate the role of prosodic prominence in Totoli, I 
conducted a production and a perception experiment which examine whether 
information-structural categories, such as focus, are marked by acoustic 
prominence. 

Studies on West Germanic languages show a fine-grained distinction between 
various focus structures, with the most pronounced difference in production 
between background and narrow focus (Mücke & Grice 2014, Baumann et al. 2006, 
Swerts et al. 2002, Kaland et al. 2023, Lee et al. 2015, Kember et al. 2021). Thus, 
if prosodic prominence were to play a role in the marking of focus in Totoli, I 
would expect it to be observable in constructions such as those in Figure 2.7. 

Most European languages make use of postlexical pitch accents as a means to 
express information-structural categories (see among others Gussenhoven 1984, 
Ladd 2008, Jun 2014a, Bolinger 1986, Halliday 1967, Grice et al. 2020, Wagner 
2012). Thus far, little is known about the role of prosodic prominence in the 
marking of information-structural categories in Austronesian languages. Sum- 
marizing what is known so far about Austronesian languages of Indonesia, Him- 
melmann (2018: 347) comments that “it seems likely that prosodic prominence 
does not have a major role to play in marking information-structural categories”. 

Inasmuch as is known from typologically diverse languages, phrasing may be 
an alternative strategy to mark focus: 


[...] the function of postlexical pitch accent in English and other Germanic 
languages (such as marking focus or disambiguating an ambiguous string) 
is performed by placing words in the same or different prosodic units, i.e. 
prosodic phrasing in Japanese and Korean. (Jun 2005b: 414) 


Question:                       Question: 
isei noninum ogo?               inan noninum sopa? 
WHO AV.RLS-drink water          mother AV.RLS-drink WHAT 
‘Who drinks the water?’         ‘What does the mother drink?’ 

Answer: 
inan noninum ogo 
mother AV.RLS-drink water 
‘The mother drinks water.’ 


Figure 2.7: Example of question-answer pairs with a syntactically identical 
answer 


As exemplified in Figure 2.7 above, Totoli allows a syntactically identical clause 
as an answer to different wh-questions that trigger narrow focus on either the 
subject or the object. This provides a suitable testing ground for the investigation 
of focus marking in Totoli and the role of prosodic prominence therein. 

Here, I present an experiment that examines the role of prosodic prominence 
in Totoli by searching for prosodic cues in the marking of information-structural 
categories. That is, I investigate whether the information structure category of 
focus is acoustically prominent in identical constructions that were uttered as 
answers to questions triggering either subject or object focus. 


2.2.1 Materials 


I recorded a set of question-answer (QA) pairs of different focus types (explicated 
below), taken from Skopeteas et al. (2006: 206-220). I selected those of narrow 
focus type for further analysis, including the examples in Figure 2.7. Different 
types of narrow focus have been identified; the QA pairs under discussion here 
correspond to what has been called “information focus” (Krifka 2008). 
I recorded the QA pairs with 6 different speakers (2 females and 4 males): mean age 
= 31.15; age range = 26-61. Further information on the recorded speakers is given 
in §A.2.1 of the Appendix. 

The QA pairs were presented to the speakers one by one in a PowerPoint 
presentation. The recruited speakers had already been recorded beforehand dur- 
ing the Totoli documentation project and, therefore, were comfortable with the 
recording setting. All recordings were done in cooperation with Datra Hassan 
(DT; see Appendix §A.2.1), native speaker of Totoli and member of the Totoli 
documentation team. He uttered the questions and the other speakers spoke the 
answers. The set of QA pairs was recorded twice with each speaker. In the first 
round, the speakers could familiarize themselves with the task and the different 
QA pairs. The recordings of this round were not used for the analysis here. Before 
the second round, the speakers were instructed to listen attentively to the 
question they were answering. This was done to ensure that, when 
uttering the answers, speakers were fully aware of the foci triggered by the ques- 
tions. Speakers were allowed to immediately repeat a QA pair if they judged their 
production to be unnatural or erroneous. 

Both Datra and the speakers were wearing head-mounted AKG C520 con- 
denser microphones, attached to a Zoom Q8 audio/video recorder. Recordings 
were done at a sampling rate of 44.1 kHz in a 16-bit mono format. 

The nine selected QA pairs involved four different answer clauses. Three of 
the answer clauses were transitive constructions of the structure subject-verb- 
object (SVO) and one of the answer clauses was a ditransitive construction of the 
structure subject-verb-object-indirect object (SVOOᵢ). Each of the three transitive 
clauses occurred twice: once as the answer to a wh-question triggering narrow 
focus on the subject and once as the answer with narrow focus on the object. The 
ditransitive clause occurred three times: as the answer to wh-questions trigger- 
ing narrow focus on the subject, on the object, and on the indirect object. 

For a full account of the set of QA pairs, see §A.2.3 of the Appendix. 

In the first step (§2.2.2), I analyzed whether the constituents differ acoustically 
with regard to focus condition. In the second step (§2.2.3), I conducted a percep- 
tion experiment with the goal of investigating whether listeners perceptually 
distinguish answer pairs such as those in Figure 2.7. The results of this analysis 
are explicated below. 


2.2.2 Acoustic analysis 


Various prosodic cues have been correlated to prosodic prominence (see among 
others Baumann et al. 2006, Maskikit-Essed & Gussenhoven 2016, Lee et al. 2015, 
Arnhold & Kyröläinen 2017). For the analysis here, I investigate duration, pitch, 
and intensity. 


Table 2.4: Type and number of constructions: in-focus constituents are 
indicated in uppercase letters and colored in blue, constituents that are 
not in focus are indicated in lowercase letters and colored in yellow. 

TYPE            WH-QUESTION              ANSWER TYPE    NUMBER OF CONSTRUCTIONS 
transitive      subject focus            Svo            3x 
                object focus             svO 
ditransitive    subject focus            Svooᵢ          1x 
                object focus             svOoᵢ 
                indirect object focus    svoOᵢ 


What is the mother drinking?        Who is drinking the water? 
Speaker A:                          Speaker A: 
The mother is drinking WATER.       THE MOTHER is drinking water. 


Figure 2.8: Example of pairwise analysis of answer pairs: in-focus constituents 
are indicated in uppercase letters and colored in blue; constituents 
that are not in focus are indicated in lowercase letters and 
colored in yellow. 

To do so, I first present and discuss data obtained from one randomly selected 
speaker. Following this, I present and discuss data on the two constituents ogo 
‘water’ and inan ‘mother’, as these occur in two different answer clauses (once 
in clause-initial subject position and once in clause-final object position) and 
in the two different focus conditions. The two clauses are given in examples (2) 
and (3). 


(2) inan noninum ogo 
    inan noN-inum ogo 
    mother AV.RLS-drink water 

    ‘The mother is drinking water.’ 

(3) ogo niinum inan 
    ogo ni-inum inan 
    water UV.RLS-drink mother 
    ‘The mother is drinking water.’ 
    (approx. ‘The water is being drunk by the mother.’) 


The two constructions are a good illustration of the influence of the focus 
condition on the realization of the constituents in different clause positions, i.e. 
initial position and final position. The discussion of phonetic parameters is sup- 
ported by a statistical analysis. 

I ran mixed-effects models with the respective parameter as dependent variable 
and focus as independent variable, using the lme4 package (Bates et al. 2015) in R 
software (R Core Team 2017). I included random effects for the speakers, position of the 
segment and the segments (random intercepts and random slopes). Furthermore, 
I included the valency of the constructions as a control variable because of the 
unequal number of observations that I obtained (three recordings for ditransitive 
constructions and two for transitive constructions). 


2.2.2.1 Pitch and focus 


To investigate the effect of focus condition on pitch contour, I measured F0 values 
for each constituent of the answer clauses in 30 time steps using Praat software 
(Boersma & Weenink 2023). It should be noted that a constituent may consist 
of a single word, such as deuk ‘dog’, or multiple words, such as manana dolago 
‘the girl’. In a subsequent step, I transformed F0 values to semitones using the 
HzToSemitones function of the soundgen package (Anikin 2019) in R software 
(R Core Team 2017), with the frequency of the reference value set to 1 Hz. 
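
The two processing steps can be illustrated with a short Python sketch. It is not the Praat or R code used for the study: it assumes the standard semitone formula st = 12 · log2(f / ref) and simple linear interpolation for the 30 time-normalized sampling points, and the function names and the toy F0 track are hypothetical.

```python
# Minimal sketch: time-normalized F0 sampling and Hz-to-semitone conversion.
# Assumes st = 12 * log2(f / ref); names and data are hypothetical.
import math


def hz_to_semitones(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12 * math.log2(f_hz / ref_hz)


def sample_equidistant(times, values, n_steps=30):
    """Linearly interpolate values at n_steps equidistant time points."""
    start, end = times[0], times[-1]
    sampled = []
    j = 0
    for k in range(n_steps):
        t = start + (end - start) * k / (n_steps - 1)
        # Advance to the measurement interval that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        weight = (t - times[j]) / (times[j + 1] - times[j])
        sampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return sampled


# Toy F0 track of one constituent: time in seconds, F0 in Hz.
times = [0.0, 0.1, 0.2, 0.3]
f0_hz = [100.0, 200.0, 200.0, 400.0]

f0_steps = sample_equidistant(times, f0_hz, n_steps=30)
contour_st = [hz_to_semitones(f) for f in f0_steps]
```

With a reference of 1 Hz, a doubling of F0 always corresponds to a rise of 12 st, and typical speaking F0 values fall at roughly 80 to 100 st.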

Consider first the pitch contours in Figure 2.9. It displays pitch contours for 
nine different target constituents produced by a randomly selected speaker (speaker 
SRN; see Table A.3 of the Appendix). Each pitch contour is labeled according to 
its focus condition (indicated by color) and grammatical role (indicated by a cap- 
ital letter). For instance, svO denotes a constituent that serves as the object of 
a transitive clause, while svoO denotes a constituent that serves as the indirect 
object of a ditransitive clause. 

Constituents in clause-initial and clause-medial position show very similar 
pitch contours in both focus conditions. For example, the pitch contours of buuk 
(svOo) ‘book’, manana dolago (Svoo) ‘girl’, ogo (Svo) ‘water’, and deuk (Svo) ‘dog’ 
demonstrate no distinguishable influence of the focus condition on their tonal 
realization, including both the shape of the pitch contours and the pitch range. 
[Nine panels: deuk ‘dog’ (Svo), inan ‘mother’ (Svo), manana dolago ‘girl’ (Svoo), ogo ‘water’ (Svo), buuk ‘book’ (svOo), dei inanna ‘to the mother’ (svoO), inan ‘mother’ (svO), ogo ‘water’ (svO), sesen ‘cat’ (svO); x-axis: time (normalized); legend: in focus, not in focus] 


Figure 2.9: Pitch in st (Ref = 1 Hz) for the target constituents of one 
randomly selected speaker (SRN): focus condition is indicated by color; for 
ditransitive constructions, recordings have two elements in the non-focus 
condition and one in the focus condition; for transitive constructions, 
each focus condition has one recording; phrase position is indicated above 
the target constituents; position and clause structure are indicated in 
brackets; time scale is normalized. 


Noticeably greater variation is attested for elements in final position, such as 
dei inanna (svoO) ‘to the mother’, sesen (svO) ‘cat’, ogo (svO) ‘water’, and inan 
(svO) ‘mother’. These elements exhibit more diverse pitch contours, indicating 
that their tonal realization is more sensitive to the focus condition. 

The latter two occur in both clause-final position (svO) and clause-initial
position (Svo), cf. examples (2) and (3), and in both focus conditions (cf. Figure 2.7).

Figure 2.10 shows these two constituents as produced by all speakers. The 
left-hand column shows their realizations in clause-initial position and the right- 
hand column shows their realizations in clause-final position. The focus condi- 
tion is indicated by color. 

Once again, the pitch contours of the two constituents in clause-initial posi- 
tion exhibit very similar shapes across the two focus conditions. However, the 
constituents ogo ‘water’ and inan ‘mother’ in initial position display variations
[Figure 2.10 panels (as recovered): inan ‘mother’ (Svo), inan ‘mother’ (svO); x-axis: time (normalized); y-axis: pitch in st (Ref = 1 Hz); legend: in focus, not in focus]


Figure 2.10: Pitch in st (Ref = 1 Hz) of the two constituents inan ‘mother’
and ogo ‘water’ in the two focus conditions for all speakers: focus
condition is indicated by color, position and clause structure are indicated
in brackets, time scale is normalized.


in the timing of the pitch rise. Specifically, the constituent ogo ‘water’ exhibits
an initial fall or level tone, a steep rise, and a final pitch plateau, while the con- 
stituent inan ‘mother’ shows a relatively continuous rise. Further investigation is 
required to determine how the segmental material of the two constituents influ- 
ences the different pitch contours in initial position. Furthermore, constituents 
in final position, in the right-hand column, exhibit much greater variability. No- 
tably, a final rising boundary tone and a final rise-fall boundary tone are clearly 
visible (see §3.2.1). The different boundary tones may potentially be correlated 
with focus marking and could serve as a cue to indicate the focus condition of 
clause-final constituents. Table 2.5 correlates the IU-final boundary tones used 
with each focus condition. 

An inspection of the boundary tone of the 54 CIUs revealed that out of 30 in- 
stances of a constituent in clause-final position not in focus, 12 were produced 
with a final rise-fall contour, while 18 had a final rise contour. For constituents 
in final position and in focus, out of 24 instances, 10 had a final rise-fall con- 
tour, and 14 had a final rise contour. Due to the small sample size, a correlation 
between final pitch movement and focus condition cannot be conclusively estab- 
lished. However, while it cannot be completely ruled out, if focus were indeed 
Table 2.5: IU-final boundary tones per focus condition (n = 54
constituents in final position)

                final rise-fall   final rise   total
not in focus    12                18           30
in focus        10                14           24
total           22                32           54


expressed by the CIU-final boundary complex, a stronger correlation would be 
expected. Hence, the findings suggest that the choice of final boundary tone does 
not indicate the presence or absence of focus. 
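
How weak the association in Table 2.5 is can be illustrated with a quick calculation. The following is a minimal sketch and not part of the original analysis: it computes the share of rise-fall contours per focus condition and a Pearson chi-square statistic (without continuity correction) for the 2×2 table. The choice of test and the function name `chi_square_2x2` are assumptions made for illustration only.

```python
import math

# Counts from Table 2.5: rows = focus condition, columns = (rise-fall, rise)
table = {"not in focus": (12, 18), "in focus": (10, 14)}

def chi_square_2x2(rows):
    """Pearson chi-square (df = 1, no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = rows
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For df = 1, the upper-tail probability equals erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Proportion of rise-fall contours in each focus condition
rise_fall_share = {k: v[0] / sum(v) for k, v in table.items()}
stat, p_value = chi_square_2x2(list(table.values()))
```

Under these assumptions, the rise-fall share is 40% for constituents not in focus and about 42% for constituents in focus, so the two distributions are nearly identical.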

Inasmuch as focus can be expressed by prosodic phrasing in the constructions 
used here, it involves variation in the chunking of Compound IUs into embedded 
IUs (cf. §3.2). In general, a subject in preverbal position is chunked into its own
IU, and variation pertains mainly to chunking the verb and a following object
NP into either one IU or two separate IUs, i.e. [ogo]IU [niinum inan]IU vs. [ogo]IU
[niinum]IU [inan]IU. These two possible realizations are shown in Figure 2.11 and
Figure 2.12, respectively.

In both realizations the subject ogo ‘water’ is chunked into its own IU, clearly
visible by the pitch rise on the last syllable .go. In Figure 2.11, the verb niinum 
‘is being drunk’ and the object NP inan ‘mother’ are chunked into one IU. In 
Figure 2.12, however, the verb occurs in a separate IU, indicated by the pitch rise 
on the final syllable .num. 

Example (3) and its two realizations in Figure 2.11 and Figure 2.12 indicate that 
speakers have a certain freedom with regard to chunking. Of 54 clauses, however, 
the realization in Figure 2.12 is the only instance in which a speaker chunks the 
verb into a separate IU. All other instances lump the verb in an IU together with 
the following object NP, as in Figure 2.11. Hence, in the recorded target clauses, 
focus is not expressed through phrasing or chunking of constituents into IUs. 

To statistically test the influence of the focus condition on pitch, I calculated 
the pitch minimum, pitch maximum, and pitch range as the difference between 
the maximum and minimum pitch values for each target constituent (subject, 
direct object, indirect object) in all the target clauses. Mean F0 values were
measured for each labeled vowel in the respective constituents to avoid octave jumps.
F0 values were then converted to semitones to control for speaker-dependent
vocal range.
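
The conversion and the three pitch measures can be sketched as follows. This is an illustrative sketch only: the reference frequency of 1 Hz is taken from the figure captions, whereas the function names and the example F0 values are hypothetical and do not come from the recordings.

```python
import math

def hz_to_semitones(f0_hz, ref_hz=1.0):
    """Convert an F0 value in Hz to semitones relative to ref_hz."""
    return 12 * math.log2(f0_hz / ref_hz)

def pitch_measures(vowel_means_hz):
    """Return (minimum, maximum, range) in semitones for one constituent."""
    st = [hz_to_semitones(f) for f in vowel_means_hz]
    return min(st), max(st), max(st) - min(st)

# Hypothetical mean F0 values (Hz) of the labeled vowels of one constituent
lo, hi, rng = pitch_measures([180.0, 200.0, 240.0])
```

Note that the pitch range in semitones depends only on the ratio of the maximum to the minimum F0, so it is unaffected by the choice of reference frequency.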

The outcome of the mixed effects model (see §2.2.2) is given in Table 2.6. 
Figure 2.11: Periogram with pitch track (in st) for example (3) with
narrow focus on the final constituent inan ‘mother’: speaker IFS >

Figure 2.12: Periogram with pitch track (in st) for example (3) with
narrow focus on the initial constituent ogo ‘water’: speaker IFS >


Table 2.6: Results of the mixed effects model with focus as independent
variable and pitch range, pitch minimum and pitch maximum as
dependent variables

         Estimate   SE      t        Pr(>|t|)
range    −0.313     0.39    −0.793   0.442
max      0.028      0.830   0.034    0.976
min      0.657      0.942   0.697    0.555
The results of the model in Table 2.6 show that there is no statistically
significant effect of focus condition on pitch range, pitch maximum, or pitch
minimum. Numerically, the target constituents are produced with a slightly
reduced pitch range (−0.313 st) and a slightly higher pitch maximum (+0.028
st) and minimum (+0.657 st) when in focus. These effects are negligible,
considering that the just noticeable difference (j.n.d.) is estimated to be around
1.5-2 st ('t Hart 1981, 't Hart et al. 1990: 29).

I conclude that the focus condition has no discernible impact on the pitch 
contour, the pitch minimum and maximum, and pitch range. 


2.2.2.2 Duration and focus 


To discover any potential effect of focus on duration, I measured the duration 
from the onset until the end of each labeled constituent. 

Similar to the above, I plotted the duration of the nine different constituents 
in both focus conditions produced by one randomly selected speaker (speaker 
ZHRM; cf. Table A.3 of the Appendix). These are given in Figure 2.13. No apparent 
effect of focus condition on duration is discernible. 


[Figure 2.13: x-axis: target constituents; y-axis: duration in seconds; legend: in focus, not in focus]


Figure 2.13: Duration (in seconds) and focus condition of the target 
words for one randomly selected speaker (speaker ZHRM): for ditran- 
sitive constructions, two elements in the non-focus condition, and one 
in focus condition were recorded; for transitive constructions, each fo- 
cus condition has one recording; position and clause structure are in- 
dicated in brackets. 


The constituents buuk (svOo) ‘book’, manana dolago (Svoo) ‘the girl’ and inan
(Svo) ‘mother’ show very similar values for both focus conditions. With regard to
the constituent ogo (Svo/svO) ‘water’, duration is longer when in focus, especially
when in object position (svO). On the other hand, sesen (svO) ‘cat’ shows shorter
duration when in focus.

To get a better picture of any systematicity across speakers, I plotted durational
values for the two words inan ‘mother’ and ogo ‘water’ produced by all speakers
in Figure 2.14.
Figure 2.14: Duration (in seconds) and focus condition of the two
constituents inan ‘mother’ and ogo ‘water’ in the two focus conditions for
all speakers; position and clause structure are indicated in brackets.


Figure 2.14 yields results similar to those obtained from the inspection of pitch 
contours in Figure 2.10. There appears to be a substantial effect of clause posi- 
tion. Elements in clause-final position are longer than elements in clause-initial 
position. This is indeed expected (cf. utterance-final lengthening/preboundary 
lengthening; Turk & Shattuck-Hufnagel 2020, Byrd et al. 2006, Berkovits 1993). 
However, the focus condition does not yield any apparent effect. The results of 
the mixed effects model (see §2.2.2) are given in Table 2.7. 


Table 2.7: Estimate of the group difference of the mixed effects model
with focus as independent variable and duration as dependent variable

Estimate   SE      t       Pr(>|t|)
0.009      0.014   0.646   0.523
The results of the mixed effects model show that the effect of focus condition
on duration is not significant (p = 0.523). 


2.2.2.3 Intensity and focus 


Lastly, I investigated whether there is any potential effect of the focus condition 
on intensity. I measured the mean intensity in decibels (dB) of each labeled con- 
stituent from the onset until the end of the constituent. As explained above, all 
recordings were done with head-mounted microphones in order to keep the dis- 
tance from the microphone to the mouth constant (see §2.2.1). This is a necessary 
prerequisite for taking into account intensity. No normalization of data is needed 
as I am visually comparing individual speaker variation and the statistical model 
includes random effects for speakers. 

Figure 2.15 shows the intensity values of the nine different constituents in both 
focus conditions as produced by one randomly selected speaker (speaker SP; cf. 
Table A.3 of the Appendix). 


[Figure 2.15: x-axis: target constituents; legend: in focus, not in focus]


Figure 2.15: Intensity (in dB) and focus condition of the target words 
by one randomly selected speaker (SP): for ditransitive constructions, 
two elements of the non-focus condition, and one in-focus condition 
were recorded; for transitive constructions, each focus condition has 
one recording; position and clause structure are indicated in brackets. 


Figure 2.15 shows that this speaker tends to produce the constituents in focus 
with a slightly lower mean intensity, with the exception of deuk (Svo) ‘dog’ and
ogo (svO) ‘water’.

To see whether this trend can be observed for other speakers as well, I plotted 
mean intensity values for the two constituents inan ‘mother’ and ogo ‘water’ of
all speakers in Figure 2.16. 

Figure 2.16 shows that the constituent ogo (svO) ‘water’ appears to generally
be uttered with a slightly higher mean intensity when in focus. In initial position, 
[Figure 2.16: y-axis: mean intensity (in dB); legend: in focus, not in focus]


Figure 2.16: Intensity (in dB) for the two constituents inan ‘mother’ and
ogo ‘water’ in the two focus conditions; position and clause structure
are indicated in brackets.


mean intensity tends to be lower when in focus (with one exception). No such 
trend is found for the constituent inan ‘mother’. 

Again, also with regard to mean intensity, no clear effect of the focus condition 
is discernible. The results of the mixed effects model (see §2.2.2) are given in 
Table 2.8. 


Table 2.8: Results of the mixed effects model with focus as independent
variable and intensity as dependent variable

Estimate   SE      t       Pr(>|t|)
0.601      0.909   0.661   0.518


The results of the mixed effects model show that there is no statistically sig- 
nificant effect of focus on intensity. 


2.2.2.4 Summary 


The results of the acoustic analysis suggest that, in the set of clauses analyzed, 
focus is not prosodically encoded by the parameters tested, i.e. by pitch range, 
minimum and maximum, CIU-final boundary-tone complexes, phrasing, dura- 
tion or mean intensity. 
However, the focus condition may be expressed by other means not tested here,
for example spectral tilt, variations in the tonal realization of constituents, vowel
quality, etc. To exclude this possibility, I conducted a perception experiment to
see whether native speakers distinguish between different focus conditions.


2.2.3 Perception experiment 


In the perception experiment, participants listened to two QA pairs. For the ques- 
tion, the same recording was used in both. The answers, however, although syn- 
tactically identical, were previously recorded as answers to different wh-questions 
that trigger different foci. Hence, in one of the two QA pairs, the answer was the 
one originally uttered in response to that particular question. In the other QA 
pair, there was a mismatch (Figure 2.18). In a two-alternative forced-choice ex- 
periment, participants were asked to identify the correctly matched QA pair. I 
expected this to be a particularly easy task if focus is encoded in the answer 
clauses. 


2.2.3.1 Participants 


Twenty participants were recruited from the Totoli community, with the
prerequisite that they were fluent speakers of the language and sufficiently
familiar with computers (mean age = 32.25; age range = 20-46). All speakers stated that
they were born and raised in the Tolitoli regency (Kabupaten Tolitoli) and were 
raised with Totoli. As is the default in the area, they are also fluent speakers of 
the local variety of Indonesian/Malay and, to varying degrees, of Standard In- 
donesian. Further information about the participants is given in §A.2.2 of the 
Appendix. 


2.2.3.2 Procedure 


Participants listened to one correctly paired QA pair and one incorrectly paired
QA pair.

The experiment was run on a laptop, using the OpenSesame platform (Mathôt
et al. 2012). Participants were told that they would hear two QA pairs, of which
one was correct and one was incorrect. Their task was to choose the QA pair
they perceived as correct. They listened to the two QA pairs twice before making
their choice. The task was repeated 72 times per participant (20 participants × 72
choices = 1440). Stimuli were presented in random order and the visual order of
the two choices was randomized as well. Figure 2.17 illustrates the experimental
setup.
[Figure 2.17 schematic: the question ‘What does the mother drink?’ is played twice, each time followed by an answer ‘The mother drinks water’, yielding stimulus QA pair 1 (correctly paired) and stimulus QA pair 2 (falsely paired); the participant chooses between the two]


Figure 2.17: Illustration of experiment procedure 


2.2.3.3 Stimulus preparation 


I used two recordings of each transitive clause: one was previously uttered as 
the answer to the wh-question triggering focus on the subject, the other was 
uttered as the answer to the wh-question triggering focus on the object. For the 
ditransitive clauses, I had three recordings per speaker, each uttered in response 
to a different question that triggered focus on each of the three constituents. The 
constructions were summarized in Table 2.4 in §2.2.1. These constructions were 
recorded with 6 different speakers, yielding 56 recorded QA pairs of which the 
answers were cut out. 

For the experiment, I combined the same question with two different answers, 
yielding two QA pairs. 


1. One QA pair consisted of a question paired with an answer that had been 
previously recorded as the answer to the same question. For instance, a 
wh-question that triggered focus on the subject was paired with an answer
that had been previously recorded in response to a wh-question that also 
triggered focus on the subject. This resulted in a correctly paired QA pair. 


2. The other QA pair was composed of the same question paired with an 
answer that had been previously recorded as the answer to a question trig- 
gering a different focus. For example, a wh-question that triggers focus on 
the subject was paired with an answer that had been previously recorded 
as the answer to a wh-question triggering focus on the object. This means 
that it was an incorrectly paired QA pair. 
[Figure 2.18 schematic: recorded QA pair A: ‘What does the mother drink?’ with the answer ‘The mother drinks water’; recorded QA pair B: ‘Who drinks water?’ with the answer ‘The mother drinks water’; both answers are combined with the question ‘What does the mother drink?’ to form stimulus QA pair 1 (correctly paired) and stimulus QA pair 2 (incorrectly paired)]


Figure 2.18: Example of pairing of questions and answers 


The pairing of the question with the two answers is exemplified in Figure 2.18.

For the pairing of questions and answers in the perception experiment, all re- 
corded answers from the QA pairs of §2.2.1 were used. The pairing was automat- 
ically generated and the order was randomized using the OpenSesame platform 
(Mathôt et al. 2012). To control for speaker variation as a potential factor influenc- 
ing the choice of the correct QA pair, both choices of answers were taken from 
recordings of the same person. However, the recorded speaker of the question 
was always different from the one who recorded the answer to create a natural 
situation where the questioner and answerer are different individuals. 
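
The pairing logic described above can be sketched as follows. This is a hypothetical illustration, not the OpenSesame script used in the experiment: the `Answer` record, the function `build_trial`, and the speaker and clause labels are all assumptions introduced here to make the constraints explicit (same answer speaker for both options, a different speaker for the question, randomized presentation order).

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Answer:
    speaker: str   # speaker who recorded the answer
    clause: str    # identifier of the (syntactically identical) clause
    focus: str     # focus triggered by the question it originally answered

def build_trial(question_focus, question_speaker, answers, clause, answer_speaker, rng):
    """Pair one question with a matching and a mismatching answer."""
    if question_speaker == answer_speaker:
        raise ValueError("question and answers must come from different speakers")
    # Both options come from the same answer speaker and the same clause
    pool = [a for a in answers if a.speaker == answer_speaker and a.clause == clause]
    correct = next(a for a in pool if a.focus == question_focus)
    mismatched = rng.choice([a for a in pool if a.focus != question_focus])
    options = [("correct", correct), ("incorrect", mismatched)]
    rng.shuffle(options)  # randomize the presentation order of the two choices
    return options

# Hypothetical recordings of one transitive clause by one speaker
answers = [
    Answer("S1", "mother-drinks-water", "subject"),
    Answer("S1", "mother-drinks-water", "object"),
]
trial = build_trial("object", "S2", answers, "mother-drinks-water", "S1", random.Random(0))
```

For ditransitive clauses, where three recordings per speaker exist, the sketch simply draws one of the two mismatching answers at random; whether the experiment did the same is not stated in the text.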


2.2.3.4 Results 


The question at hand is whether participants exhibit a significant preference for 
the correctly paired QA pair over the incorrectly paired items (chance level = 
50%). If so, it can be assumed that the answer clauses carry prosodic cues that 
encode information about focus. In other words, if the prosody of Totoli encodes 
focus, then participants should show a preference for the correctly paired QA 
pairs. The distribution of question-answer assignments is depicted in Figure 2.19, 
which reveals that participants selected the correctly paired QA pair in 52% of
all trials and the incorrectly paired QA pair in 48%.


[Figure 2.19: y-axis: QA-assignments (0-100%); x-axis: ALL; legend: falsely paired, correctly paired]


Figure 2.19: Distribution of question-answer assignments: horizontal 
line indicates chance level 50%, n = 1440. 


In order to determine the significance of the participants’ tendency to choose
the correct answer, I fitted a logistic mixed effects model with the choice
of QA pair as the dependent variable. The model did not include any predictors.
In this case, the intercept is the only parameter of interest, as it measures the
probability of choosing the correct answer, with an intercept of 0 on the logit
scale representing 50%. As random intercepts, I included the rater and recorded
speaker, as well as the focus type (see Table 2.4). I used glmer() with family =
binomial of the lme4 package (Bates et al. 2015) in R (R Core Team 2017).

The estimate of the intercept was very close to 0 (0.077), indicating only a slight 
preference for the correctly paired QA pair. This result was not significantly dif- 
ferent from 0 (p = 0.183; SE = 0.058; z = 1.332). Based on these findings, I 
concluded that the tendency to select the correctly paired QA pair was likely 
due to chance. 
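
To make the interpretation of the intercept concrete, the reported estimate can be converted back to a probability. The following is a minimal sketch under the assumption that the reported z value is a Wald statistic (estimate divided by standard error); it does not refit the model, and the helper names are chosen here for illustration.

```python
import math

def inverse_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1 / (1 + math.exp(-x))

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

intercept, se = 0.077, 0.058           # values reported in the text
p_correct = inverse_logit(intercept)    # estimated probability of a correct choice
z = intercept / se                      # Wald z statistic
p_value = two_sided_p(z)
```

The intercept of 0.077 corresponds to a probability of roughly 52%, in line with the proportion shown in Figure 2.19. The recomputed z differs slightly from the reported 1.332, presumably because the published estimate and standard error are rounded.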

To further investigate participant performance, I analyzed whether there were 
any observable effects. Specifically, I wanted to determine whether some partic- 
ipants performed differently than others, whether participant performance de- 
pended on the recorded speaker, and whether performance varied depending on 
the position and grammatical role of the constituent in focus. 
2.2.3.4.1 Effect of rater


First, I analyzed the distribution of the question-answer assignment across par- 
ticipants. The results are shown in Figure 2.20. 


[Figure 2.20: y-axis: QA-assignments (0-100%); x-axis: raters; legend: falsely paired, correctly paired]

Figure 2.20: Distribution of question-answer assignments by rater:
horizontal line indicates chance level 50%.


Figure 2.20 shows that the performance is comparable across participants.
One participant (rater 1) showed a particularly high frequency of selecting the
correctly paired QA pair (in 69% of all instances), while another participant (rater
20) showed a higher preference for the incorrectly paired QA pair (in 39% of all
instances).


2.2.3.4.2 Effect of recorded speakers 


Second, I analyzed the distribution of the question-answer assignment according 
to the recorded speaker of the answers of the QA pairs. The results are plotted 
in Figure 2.21. Note that the QA pairs were paired such that the answer in both 
was uttered by the same speaker (cf. §2.2.3.3). 

Figure 2.21 shows that there is only slight variation in task performance de- 
pending on the recorded speaker. The differences appear to be negligible. 


2.2.3.4.3 Effect of position 


Third, I investigated the distribution of the question-answer assignment according
to the position in the clause of the constituent that is focused by the question.
The results are plotted in Figure 2.22.

Figure 2.22 shows that the position in the clause of the constituent that is
focused by the question does not affect performance in choosing the correctly
paired QA pair.
[Figure 2.21: y-axis: QA-assignments (0-100%); x-axis: recorded speakers (Speaker 1 to Speaker 6); legend: falsely paired, correctly paired]

Figure 2.21: Distribution of question-answer assignments by recorded
speaker: horizontal line indicates chance level 50%.

[Figure 2.22: y-axis: QA-assignments (0-100%); x-axis: sentence position of target phrases; legend: falsely paired, correctly paired]

Figure 2.22: Distribution of question-answer assignments by position
of the in-focus constituent in the clause: medial position means the
direct object of ditransitive constructions (svOo), final position means
the indirect object of ditransitive constructions and the object of the
transitive constructions (svO/svoO); horizontal line indicates chance
level 50%.


2.2.3.4.4 Summary 


The data exploration examined three factors that may have influenced the question- 
answer assignment performance: rater, recorded speaker, and position in the 
clause of the constituent on which the question triggers focus. Despite some 
variability, particularly with individual raters, the analysis revealed that none of 
these factors had a significant impact on participants’ performance. 


2.2.4 Discussion 


In this section, I presented an experimental analysis of the interaction of prosody 
and focus in Totoli. To this end, I presented an experiment that examines the role 
of prosodic prominence in Totoli by searching for prosodic cues in the marking 
of information-structural categories. 
The topic was first approached by an investigation that analyzed whether the
information-structural category of focus is acoustically prominent in identical 
constructions that were uttered as answers to questions triggering either subject 
or object focus (§2.2.2). The following section described a perception experiment 
to investigate whether native speakers distinguish between different focus con- 
ditions (§2.2.3). 

In order to investigate the effect of focus condition on the production of syn- 
tactically identical constructions, I analyzed the phonetic parameters related to 
pitch, duration, and intensity (§2.2.2.1-§2.2.2.3). However, no significant effect
of the focus condition was observed in any of these parameters. To exclude the 
possibility of focus condition being expressed by other means not tested here, I 
conducted a perception experiment to see whether native speakers distinguish 
between different focus conditions (§2.2.3). However, the results of the percep- 
tion experiment revealed no significant difference in the perception of syntacti- 
cally identical clauses recorded as answers to questions with different foci. Based 
on these findings, I conclude that focus is not prosodically encoded in the set of 
clauses analyzed. 

As the present investigation focuses on a controlled experimental analysis of 
focus marking in Totoli, it is important to note that in less controlled situations, 
speakers of Totoli use syntactic means and concomitant prosodic phrasing to 
express focus. While an in-depth investigation of this focus marking strategy is 
beyond the scope of this study, two examples are provided for illustration pur- 
poses. These instances are taken from a recording of an adapted version of the 
Anima elicitation game described in Skopeteas et al. (2006: 99-107), which was 
originally designed to elicit different focus types. In this game, participants are 
shown pictures and then asked questions about them. Examples (4) and (5) illus- 
trate focus marking in Totoli in this less controlled setting. 

In example (4), the speaker answers a question about the patient
of the situation (“In front of the well: What is the man pushing?”; adapted from
Skopeteas et al. 2006: 103). The speaker replies with a cleft construction in
order to mark the focus on the patient oto ‘car’. The focused constituent occurs in
sentence-initial position and is followed by a free relative clause, which is
introduced by the relative particle anu ‘REL’. Figure 2.23 shows the periogram with
pitch track (in st) of example (4).


(4) a. oto
       oto
       car
       ‘(it is a) car’

    b. anu laalau suludan tau moane dei dulak ogobbun ana
       anu laa-lau sulud-an tau moane dei dulak ogobbun ana
       REL RDP-presently push-APPL person male LOC front well MED
       ‘that is currently pushed by the man in front of the well’
       (QUIS-focus_SP.041-42) >


[Figure 2.23: y-axis: pitch in st; x-axis: time; segment labels: oto | anu laalau suludan tau | moane | dei dulak ogobbun ana]

Figure 2.23: Periogram with pitch track (in st) for example (4), speaker
SP


The pitch contour in Figure 2.23 shows that the focused constituent oto ‘car’
is parsed into its own prosodic phrase, marked by a final rise-fall boundary-tone
complex with a high target located at the beginning of the ultimate syllable. The
relative clause anu laalau suludan tau moane dei dulak ogobbun ana ‘which is
currently pushed by the man in front of the well’ is pronounced as one prosodic
phrase with a final boundary-marking tonal complex, consisting of a low target
located at the boundary between the penultimate and ultimate syllable, followed
by a final high tone.

Another example is provided in (5), where the speaker responds to a question 
about the agent of the situation (“In front of the well: Who is pushing the car?”; 
cf. Skopeteas et al. 2006: 103). Similar to the previous example, the speaker uses 
a cleft construction to mark the focus on the agent moane ‘man’, which is placed 
in the sentence-initial position, followed by a free relative clause introduced by 
the relative particle anu ‘REL’. Figure 2.24 shows the periogram with pitch track 
(in st) of example (5). 
(5) moane anu lau monuludan oto
    moane anu lau moN-sulud-an oto
    man REL presently AV.RLS-push-APPL car
    ‘It is a man who is currently pushing the car.’ (QUIS-focus_SP.10) >


[Figure 2.24: y-axis: pitch in st; x-axis: time; segment labels: moane anu lau monuludan oto]

Figure 2.24: Periogram with pitch track (in st) for example (5), speaker
SP


The pitch contour in Figure 2.24 shows that the prosodic realization is similar 
to that of example (4) above. The focus constituent moane ‘man’ is chunked into 
its own prosodic phrase, clearly demarcated by a prosodic boundary in the form 
of a rise-fall boundary-tone complex with a high target located at the bound- 
ary between the penultimate and the ultimate syllable. The relative clause anu 
lau monuludan oto ‘who is currently pushing the car’ is uttered as one prosodic
phrase with a final boundary-marking tonal complex that consists of a low tar- 
get located at the boundary between the penultimate and the ultimate syllable, 
followed by a final high tone. No pause or pitch reset occurs between the focus 
constituent and the relative clause. In cleft constructions such as example (4), 
the fronted constituent is necessarily chunked into its own prosodic phrase and 
therefore has to be demarcated with a phrase-final boundary tone. 

The prosodic marking appears to be concomitant to the syntactic marking 
of focus. Crucial to this discussion is the fact that in a controlled environment, 
Totoli allows syntactically identical SVO constructions as answers to different 
wh-questions that trigger narrow focus on either the subject or the object. Yet, 
in such constructions where the focus structure is not syntactically encoded, it 
is also not acoustically prominent. 
With regard to purely prosodic strategies of marking focus, Lee et al. (2015:
4754) comment that it “is less commonly recognized that purely prosodic mark- 
ing of focus may be much weaker in some languages than in others, to the extent 
that purely prosodic focus may be nearly absent as a general mechanism for com- 
munication of information structure”. Totoli apparently presents such a case. 


2.3 Conclusion 


The aim of this chapter was to investigate the role of prosodic prominence in 
the intonation of Totoli. The RPT experiment in §2.1 showed that participants 
generally do not agree on the judgment of prominences. Hence, similar to Papuan 
Malay, the results for Totoli make it “doubtful whether prosodic prominence can 
be usefully distinguished from boundary marking in this language” (Riesberg et 
al. 2018: 389). This was further tested by a subsequent focus marking experiment 
that included a production and a perception part, with the question whether 
speakers of Totoli employ purely prosodic means to mark SVO sentences with 
different focus structures. However, speakers of Totoli do not use purely prosodic 
means to express focus. 

The results from the RPT experiment and the focus marking experiments show 
that prosodic prominence does not play a role in the prosodic system of Totoli. 
In other words, Totoli does not employ postlexical pitch accents to mark focus, 
which is similar to other Austronesian languages in the region (cf. Maskikit- 
Essed & Gussenhoven 2016 on Ambonese Malay, Riesberg et al. 2018 on Papuan 
Malay, Goedemans & van Zanten 2007 on Standard Indonesian). One of the few 
studies that specifically investigated phrasal prominence and the realization of 
focus is Maskikit-Essed & Gussenhoven (2016), who conducted a study on Am- 
bonese Malay. They elicited scripted speech in the form of question-answer mini- 
dialogues that were controlled for focus condition. In the target words analyzed, 
they came to the conclusion that “in effect, this means that Ambonese Malay does 
not express focus in its prosody” (Maskikit-Essed & Gussenhoven 2016: 383) and 
that there is no evidence for stress in general in that language. Furthermore, they 
hypothesize that their findings may actually apply also to other Malay varieties 
(Maskikit-Essed & Gussenhoven 2016: 391). The absence of (post)lexical stress,
however, may not only be a feature found in many Indonesian/Malay varieties 
but may likewise be found in many other languages of the region, as the results 
from Totoli suggest. 

The experimental results obtained here are highly relevant for the following 
chapter, in which I turn to an investigation of the tonal realizations of CIUs of a 
corpus of Totoli (§3.2.1-§3.2.4), and in which I propose an intonational model of 
the CIU (§3.2). 


3 Corpus-based approaches to the
intonation of Totoli 


In this chapter, I investigate the segmentable prosodic units as they occur in 
the corpus of (semi-)spontaneous speech of Totoli. Based on the analysis of tonal 
patterns in §3.2, I conclude that in Totoli we have to assume recursive embedding 
of IUs into complex Compound IUs rather than IUs that are parsed into lower-
level prosodic units. Hence, the label CIU here. The CIU in my analysis is thus
equivalent to the label IU in other studies (Croft 1995, 2007, Tao 1996, Park 2002, 
Schuetze-Coburn et al. 1991, Schuetze-Coburn 1994, Matsumoto 2003, Iwasaki 
1996, Iwasaki & Tao 1993, Wouk 2008). For the sake of clarity and readability, I 
use the label CIU for both complex CIUs that consist of several embedded IUs 
and also for singleton IUs (see Figure 3.4).

In §3.1, I describe some of the fundamental properties of the CIU in the corpus, 
including their categorization, distribution, and length. Section §3.2 presents an 
intonational model and discusses the tonal specifications of boundaries of proso- 
dic units. Finally, in §3.3, I investigate the prosody-syntax interface by examining 
the syntactic content of CIUs as a whole (§3.3.1) and the embedded IUs of CIUs
in particular (§3.3.2).


3.1 Properties of the Compound Intonation Unit in Totoli
in a cross-linguistic perspective 


From the discussion in §1.4, it is clear that the IU is “the spoken-language
analyst's most popular unit of choice for analysis” (Croft 1995: 841) and it is
considered the basic unit structuring discourse. Furthermore, it is locally managed and
“different sizes of IUs are used in different interactional contexts” (Park 2002: 
674). This section explores some properties of the 3226 CIUs in the corpus of 
Totoli and compares these with data reported for other languages. 




3.1.1 The corpus 


The corpus consists of 21 selected recordings that were segmented and annotated 
by me according to the criteria described in §1.4. All recordings were made using 
head-mounted microphones worn by the consultants, with an additional record- 
ing on the built-in camera microphone. Video and audio were recorded with a 
Zoom Q8 audio/video recorder with two external AKG C520 head-mounted con- 
denser microphones at a sampling rate of 48 kHz. 

I distinguish between conversational and monological data, as they are often 
theorized to differ substantially in various ways, such as “information pressure” 
(Du Bois 1987: 836). In this section, frequency distributions of various phenomena
are displayed for the entire corpus, as well as for conversational and monologi- 
cal data separately. Inclusion of both types of data ensures that certain prosodic 
phenomena that may occur infrequently, if at all, in either genre are accounted 
for. Analyzing these data categories individually allows us to investigate how 
various types of genres influence the frequency of tonal events and syntactic 
structures. By taking this approach, we can draw more nuanced and detailed 
conclusions about the relationship between prosodic and syntactic phenomena 
and discourse genre. 

The corpus includes recordings from 15 speakers: 11 male and 4 female. Further 
information on the speakers is given in Table A.6 and Table A.5 in the Appendix.
Table 3.1 below gives an overview of the recordings. 


Table 3.1: Overview of recordings 


interactivity type    n CIUs    duration

conversational        1327      00:40:08
monological           1899      01:30:08
total                 3226      02:10:16


Read, elicited, or otherwise highly planned speech such as data obtained from 
a laboratory setup are not included in the corpus. I used this type of data only in 
the experiments on focus marking, which are described in §2.2. 

Obtaining natural conversational data while speakers are wearing head-mounted 
microphones is rather difficult, especially when working with endangered lan- 
guages such as Totoli. To overcome this difficulty, I used two different elicitation 
games to obtain the conversational data: 

The Animal Game is an elicitation game described in Skopeteas et al. (2006: 111–117)
with the original purpose of eliciting narrow/contrastive focus. A
stack of cards with photos is divided equally between two speakers who 
take turns in describing the different pictures on the cards. I also used the 
game in a monological setting with one speaker only. 


Man and Tree & Space Games is a classic elicitation game, originally designed 
to explore spatial reference in field settings (Levinson et al. 1992). It has 
proven to be a very interactive task where two participants are each pre- 
sented with identical sets of 12 cards displaying different items. Without 
seeing the interlocutor’s stack of photos, both participants have to describe 
the relevant details of a certain card to find an exact match. Once they are 
certain they have found the matching one, they put the card aside. After all 
cards have been described, they are checked to see whether or not the cards 
match. Participants usually, but not obligatorily, take turns in describing. 
The game involves four rounds. 


The monological data comprise recordings of various genres. All data were 
recorded in a face-to-face situation with a local member of the Totoli documen- 
tation team, with both parties wearing head-mounted microphones. The record- 
ings were of different types: 


The Pear Story is a short movie, designed by Chafe (1980). The corpus contains 
several recordings of retellings of the film. Participants watch the movie 
first and are then asked to narrate the story-line. 


Anima is an elicitation game described in Skopeteas et al. (2006: 99-107) with 
the original purpose being to elicit different focus types. In the game, par- 
ticipants are asked a set of questions about the photos they are seeing. 


The Animal Game is an elicitation game described in Skopeteas et al. (2006: 111–117)
for eliciting narrow/contrastive focus. The speaker receives a stack of
cards with photos on them and then simply describes the different pictures 
on the cards. 


Stories and folktales were recorded with three different speakers. These include, 
firstly, a recording of a lengthy account of a particularly memorable period 
of the speaker’s life, and, secondly, five folk tales. 


Explanatory texts were obtained from two different speakers, each describing 
an important cultural event, namely wedding traditions and a special rit- 
ualized singing game called Lelegesan. In this game, two or more singers 
“spontaneously produce as many rhyming two-liners as possible” (Riesberg
2019: 83).


It is important to note that the language studied in this research is endangered 
and has only a few remaining speakers. In such circumstances, opportunities for
data collection are limited, and an optimal setup may not be attainable. In this 
study, this led to, e.g., an unavoidable imbalance between conversational and 
monological data, which must be taken into account when interpreting the re- 
sults presented in this chapter. 


3.1.2 Distribution of Chafeian IU types in the corpus 


Chafe (1994: 63-64) proposes a categorization scheme of IUs into different types:


Substantive IUs are those which convey events, states or referents 
(e.g. examples (47b), (43a), (15), (17)). 


Regulatory IUs regulate interaction and the flow of information 
(e.g. examples (42a), (43b), (46a), (44b)). 


Fragmentary IUs are unsuccessful IUs that are truncated or abandoned (e.g. (25c)). 


Regulatory IUs are further grouped into textual (e.g., and, then, well), interac- 
tional (e.g., mhm, you know), cognitive (e.g., let me see, oh) and validational (e.g., 
maybe, I think) (Chafe 1994: 65). In Totoli, frequent items of this category are dis- 
course particles and interjections such as mh, io, aih. Other items included are 
connectors such as bali ‘so/then’ and the filler element anu. 

Figure 3.1 shows the distribution of the three types in the corpus of Totoli. The
category “Rest” includes uncodable or unintelligible utterances. 

The frequency distribution indicates that 71.9% of the 3226 CIUs in the corpus 
are of the substantive type, while 15.9% are regulatory and 6.6% are fragmentary 
CIUs. The majority of CIUs in both settings are of the substantive type. How- 
ever, the distribution of CIU types within each setting differs: the proportion 
of regulatory CIUs is 12.8 percentage points higher in conversational data than 
in monological data, and the proportion of substantive CIUs is 12.7 percentage 
points lower. 
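
The comparison above is stated in percentage points, i.e. as a difference between two shares. The following Python sketch shows one way such a comparison could be computed from a list of annotated CIU types per setting. The label lists are hypothetical and invented purely for illustration; they are not the corpus counts, and the function name `type_shares` is an assumption of this sketch.

```python
from collections import Counter


def type_shares(labels):
    """Percentage share of each CIU type in a list of type labels."""
    counts = Counter(labels)
    total = len(labels)
    return {t: 100 * n / total for t, n in counts.items()}


# Hypothetical label lists (invented for illustration; not the corpus counts).
conversational = ["substantive"] * 6 + ["regulatory"] * 3 + ["fragmentary"] * 1
monological = ["substantive"] * 8 + ["regulatory"] * 1 + ["fragmentary"] * 1

conv = type_shares(conversational)
mono = type_shares(monological)

# Differences in percentage points (share minus share), not in percent.
regulatory_gap = conv["regulatory"] - mono["regulatory"]
substantive_gap = conv["substantive"] - mono["substantive"]
```

With these toy lists, the regulatory share is higher and the substantive share is lower in the conversational setting, which mirrors the direction of the difference reported for the corpus.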

The frequency distribution of Chafeian IU types is rarely reported. However, 
in a study on Japanese two-party conversations, Matsumoto (2003: 50) found that 
[Figure 3.1, three panels: all (n = 3226), conversational (n = 1327), monological (n = 1899); x-axis: Substantive, Regulatory, Fragmentary, Rest; y-axis: distribution of occurrences in %]


Figure 3.1: Frequency distribution of CIU types 


81% of the IUs were substantive, 17% were regulatory, 0.9% were fragmentary, and 
1.1% were uncodable. 

Chafe (1994: 63) has suggested that this categorization “is useful because cer- 
tain aspects of an analysis can be directed at one of these types to the exclusion 
of the others.” However, the distinction between regulatory and substantive IUs 
is not always clear-cut. While Croft (1995, 2007) agrees that the IU is the basic 
discourse unit, he does not adopt the Chafeian categorization and provides his 
own set of criteria. Similarly, Tao (1996: 59) has stated that “in order to avoid 
any arbitrary decisions, I have chosen not to discriminate between the two types 
of IUs but instead provide a detailed grammatical taxonomy of all IUs.” For the
Totoli corpus, I have utilized the Chafeian classification. Overall, I found that the 
vast majority of IUs were easily classifiable. 


3.1.3 The length of Intonation Units of the corpus 


The length of IUs has received a lot of attention in the literature and has been
used as a central argument for the IU as a cognitive unit (see §3.3). Yet, little effort
has been made to measure the length of an IU, as it is not a straightforward
task. Several ways of measurement are conceivable. Chafe (1994: 64) discusses
the number of words an IU contains as the “simplest and most obvious measure.”
With regard to English, one is left with several figures:


• In an early publication, Chafe (1980: 14) states the average length of all IUs
taken together to be about 6 words.


• In Chafe (1988: 42), he suggests that “(t)he intonation units of ordinary spo-
ken language show a relatively constant length [...]. In English the mean
length of an intonation unit is between 5 and 6 words.”


• In Chafe (1993: 39), he states that “one finds the modal length of regulatory
units to be one word and that of substantive units to be five words.”


• In his account of the English Pear Story corpus, Chafe (1994: 64) finds Reg-
ulatory IUs to have a mean length of 1.36 words, and Substantive IUs to have
a mean of 4.84 words with a modal length of 4.


• Croft (1995), referencing Altenberg (1987: 282) and Crystal (1976: 256), re-
ports counts of 4 and 6 words, respectively.


Disregarding the counts for regulatory IUs, we can summarize that an IU in 
English roughly contains about four to six words. Importantly, these counts hold 
for English only and languages vary with regard to the average number of words 
an IU contains. 

With regard to typologically diverse languages, mean numbers of words per 
IU vary from two to five. 


• Tao (1996: 52-54) reports an average of 3.5 and a modal length of 3 words
for Mandarin Chinese.


• Chafe (1994: 148) reports a modal and an average length of 2 words for
Seneca.


• Himmelmann et al. (2018: 224) report average numbers of words per IU
in 4 typologically unrelated languages: 5.13 for German, 3.73 for Papuan
Malay, 3.37 for Wooi, and 3.44 for Yali.¹


Commenting on the difference in length of IUs in the four investigated lan- 
guages, Himmelmann et al. (2018: 222) note that the main difference lies in the 
grammatical structure, but that the orthographic conventions of the languages 
also play a role. 

Figure 3.2 shows the distribution of substantive CIUs in the Totoli corpus
according to their length, measured in number of words. Both conversational
and monological CIUs are included. The distribution of CIUs shows
that the modal length is 2 words in conversational data and 3 words in monologi- 
cal data. Monological data is less skewed and CIUs with more than one word are 
much more common. 

Measuring the length of IUs in words has its drawbacks. As the data for Totoli 
show, the length of CIUs varies substantially according to the data type used, 


¹Only the consensus values of expert annotators are given here.

Figure 3.2: Distribution of substantive CIUs of the corpus according to
lengths in words 


even within the same language. Furthermore, the grammatical structure and or- 
thographic conventions of any given language will result in different counts that 
may render a comparison difficult. 

As Fenk-Oczlon & Fenk (2002: 222) put it: 


In languages with a pronounced tendency to synthetic (agglutinative or 
fusional) morphology we have to expect a lower number of words per in- 
tonation unit (and in polysynthetic and incorporating languages even one 
long word that we would encode in a sentence comprising 5 or 6 words.) 


This touches on topics of wordhood (Tallman 2020) and alternative measures 
of the length of an IU have been suggested. Research on language acquisition 
has long focused on the length of “utterances” and on the question of what to 
measure (for an overview of different measures, see Allen & Dench 2015). 

A possible alternative is to measure the number of syllables an IU contains 
(Schuetze-Coburn 1994: 161). Himmelmann et al. (2018) propose another alterna- 
tive for measuring the length of IUs in terms of content words. Probably the most 
straightforward way, yet also the one most susceptible to speaking style and in- 
dividual speaker differences, is to measure the length of IUs in terms of duration 
in seconds. Fenk-Oczlon & Fenk (2002: 221) cite Määttä (1993), who reports an
average length of a “breath group” of about 2.1-2.2 seconds. Chafe (1980) reports 
a mean length of 2 seconds (incl. pauses). 

I include descriptive statistics of the CIU in Totoli in terms of the different 
measurements — number of words, syllables and total duration - for each of the 
Chafeian (1994) IU types; see Table 3.2. 
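
The three measures can be summarized per IU type with standard descriptive statistics. The Python sketch below illustrates the idea with the standard library only. The records are hypothetical values invented for illustration and are not taken from the corpus; the function name `describe` and the use of the sample standard deviation are likewise assumptions of this sketch.

```python
from statistics import mean, median, stdev

# Hypothetical annotation records: (IU type, words, syllables, duration in s).
# These values are invented for illustration and are not corpus data.
cius = [
    ("substantive", 3, 8, 1.24),
    ("substantive", 5, 11, 1.90),
    ("substantive", 2, 5, 0.85),
    ("regulatory", 1, 1, 0.40),
    ("regulatory", 1, 2, 0.52),
]


def describe(records, iu_type):
    """Return mean, median and sample sd of each length measure for one IU type."""
    rows = [r for r in records if r[0] == iu_type]
    summary = {}
    for name, idx in (("words", 1), ("syllables", 2), ("duration", 3)):
        values = [r[idx] for r in rows]
        summary[name] = {
            "mean": round(mean(values), 2),
            "median": median(values),
            "sd": round(stdev(values), 2),
        }
    return summary


substantive = describe(cius, "substantive")
regulatory = describe(cius, "regulatory")
```

Reporting the median alongside the mean is useful here because length distributions of this kind tend to be right-skewed.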

In this section, I have described some of the fundamental properties of the IU 
in general and the CIU in Totoli specifically. In the next section, I analyze the 

Table 3.2: Description of CIUs in the corpus in terms of number of
words, syllables and total duration in seconds, divided over Chafeian 
(1994) IU-types and data types 


                 all                        conversational             monological
                 n words  n syll  d (s)     n words  n syll  d (s)     n words  n syll  d (s)

substantive
  mean           3.81     8.90    1.48      3.43     7.80    1.26      4.04     9.54    1.61
  median         3        8       1.24      3        7       1.07      3        8       1.35
  sd             2.35     5.31    0.92      2.35     4.39    0.77      1.99     5.68    0.97
regulatory
  mean           1.24     1.86    0.52      1.2      1.85    0.50      1.3      1.87    0.57
  median         1        1       0.40      1        2       0.37      1        1       0.48
  sd             0.77     1.68    0.61      0.55     1.19    0.70      1.05     2.31    0.43
all
  mean           3.24     7.31    1.26      2.75     5.91    1.02      3.60     8.35    1.45
  median         3        6       1.04      2        5       0.84      3        7       1.21
  sd             2.32     5.46    0.93      1.95     4.52    0.80      1.50     5.86    0.98


tonal specifications of CIUs in the corpus and propose an intonational model 
thereof. 

In order to conduct a corpus-based analysis, I segmented the corpus into Com- 
pound Intonation Units, for which I described the criteria in §1.4. Whether native 
and naive listeners actually perceive the segmented Compound Intonation Units 
as such can be answered by revisiting the results from the RPT experiment (§2.1). 
Table 2.1 in §2.1.1 provides an overview of the stimuli used in the RPT experiment. 
In the experiment, participants rated 71 speech samples, of which 26 consisted of 
2 CIUs and 4 consisted of 3 CIUs, according to the segmentation criteria applied 
to the corpus. Hence, I collected boundary judgments for 105 words in final posi- 
tion of a CIU. The 71 boundary judgments for words in final position of a speech 
sample were discarded, resulting in 34 boundary judgments for CIU-final but not 
stimulus-final words. 
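
The counts in this paragraph can be retraced step by step. The short Python sketch below does so using only the figures given above; it assumes that the samples not mentioned as containing 2 or 3 CIUs each consist of a single CIU. The variable names are illustrative.

```python
# Figures taken from the paragraph above.
n_samples = 71
n_with_two_cius = 26
n_with_three_cius = 4

# Assumption of this sketch: all remaining samples consist of a single CIU.
n_with_one_ciu = n_samples - n_with_two_cius - n_with_three_cius

# Every CIU contributes exactly one CIU-final word.
ciu_final_words = n_with_one_ciu * 1 + n_with_two_cius * 2 + n_with_three_cius * 3

# The last word of each sample is both CIU-final and stimulus-final and is discarded.
ciu_final_not_stimulus_final = ciu_final_words - n_samples
```

Under this assumption, the 41 single-CIU samples contribute 41 CIU-final words, the 26 two-CIU samples contribute 52, and the 4 three-CIU samples contribute 12, which yields the 105 judgments mentioned above and 34 after discarding the stimulus-final ones.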

Figure 3.3 shows the b-scores (the proportion of speakers who marked the 
respective word as prominent or as preceding a boundary, cf. §2.1.4) for words in 
CIU-final position in comparison to b-scores for words in CIU-internal position. 
[Figure 3.3, panel titles: Words in IU-final position (n = 34); Words in IU-internal position (n = 379)]

Figure 3.3: Comparison of b-scores of words in CIU-final and in CIU-
internal position: median = 10, mean = 20.91 for words in CIU-internal
position; median = 90, mean = 85.44 for words in CIU-final position; 
words in final position of a speech sample are excluded 


The boundary scores for words that occur in CIU-final position are substan- 
tially higher (median = 90; mean = 85.44) than those for words in CIU-internal 
position (median = 10; mean = 20.91). This difference indicates that what I considered
a Compound Intonation Unit in the segmentation of the corpus is actually
perceived as a unit, and thus confirms the viability of units obtained from the 
corpus segmentation. 
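
To make the comparison concrete, the following Python sketch computes b-scores for two small sets of words and summarizes them by position. The judgments are hypothetical and invented for illustration; expressing the score as a percentage of participants is an assumption made here to match the scale of the values reported above, and the function name `b_score` is likewise illustrative.

```python
from statistics import mean, median


def b_score(marks):
    """Percentage of participants who marked a boundary after the word.

    `marks` holds one boolean per participant.
    """
    return 100 * sum(marks) / len(marks)


# Hypothetical judgments from 10 participants per word (invented for illustration).
ciu_final = [
    [True] * 9 + [False] * 1,
    [True] * 10,
    [True] * 8 + [False] * 2,
]
ciu_internal = [
    [True] * 1 + [False] * 9,
    [False] * 10,
    [True] * 2 + [False] * 8,
]

final_scores = [b_score(word) for word in ciu_final]
internal_scores = [b_score(word) for word in ciu_internal]

final_summary = (median(final_scores), mean(final_scores))
internal_summary = (median(internal_scores), mean(internal_scores))
```

A large gap between the two summaries, as in the experimental data, is what one would expect if the segmented units correspond to perceived units.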


3.2 An intonational model of the Compound Intonation 
Unit in Totoli 


Having described the corpus in detail, I will now turn to the study of Totoli in- 
tonation. This study focuses on the Compound Intonation Unit, which is - to 
repeat Chafe’s (1987: 22) definition of an Intonation Unit — a “sequence of words 
combined under a single, coherent intonation contour, usually preceded by a 
pause.” 

In this section, I propose an intonational model of the CIU in Totoli. The model 
is couched in the autosegmental-metrical framework and the ToBI framework 
(Arvaniti & Fletcher 2020, Jun 2005c, 2014a, Ladd 2008) and assumes singleton In- 
tonation Units or Compound Intonation Units. The model takes up Ladd’s (2008: 
297; chapter 8.2) notion of the Compound Prosodic Domain (CPD), i.e. strings of 
IUs which are recursively embedded and together form a CIU. Singleton IUs or 
CIUs are strings of words that are combined under a coherent intonation con- 
tour, usually preceded by a pause (Chafe 1987: 22). They are perceived as such by 
listeners due to a complex interplay of cues mainly pertaining to pitch, rhythm 
and voice quality (Schuetze-Coburn 1994: 93-155, Himmelmann 2006: 260-270,
Du Bois et al. 1992, 1993, and Cruttenden 1997: 29-39). Tonal specifications are 
assigned at the level of the IU, and they are associated with their right-edge 
boundary and consist of a bitonal edge-tone complex. The right-edge of an IU — 
a singleton IU or an embedded IU of a CIU - is demarcated by one of the three 
proposed boundary tones (see §3.2.1 and §3.2.2). In a CIU, only the last embedded 
IU is followed by typical final cues, such as pause and pitch reset (Schuetze-
Coburn et al. 1991: 217). Tonal specifications of singleton IUs and embedded IUs
are equal and vary only with regard to their frequency distribution (see §3.2.5). 

The intonational model for singleton IUs and CIUs in Totoli is shown in Fig- 
ure 3.4. It brings together insights from the two experiments described in Chap- 
ter 2 and findings from a large scale investigation of tonal events and syntactic 
content of the prosodic units presented in the remainder of this chapter (§3.2.1- 
§3.2.4). 


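[Schematic: a singleton IU (left) and a CIU with embedded IUs (right); each IU is a bracketed string of syllables (σ) carrying a T-T% boundary-tone complex at its right edge]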


T% = boundary tone 


T- = phrase tone 


Figure 3.4: The CIU in Totoli: A singleton IU on the left-hand side, re- 
cursively embedded IUs into a Compound Intonation Unit (CIU) on the 
right-hand side. 


The model analyzes IU-final pitch events as bitonal boundary-tone complexes, 
consisting of a phrase tone (T-) and a boundary tone (T%). In §3.2.2, I show that 
the right-edge boundary of an embedded IU of a CIU is not merely classified by 
a single boundary tone such as an H% or T%. Instead, I propose a set of three 
different final boundary-tone complexes that are essentially the same as those 
occurring at the right-edge boundary of the last IU of a CIU or a singleton IU 
respectively. 

In the model, I use the more theory-neutral label phrase tone instead of phrase
accent, which is typically anchored to a metrically strong syllable (Grice et al. 2000).
The tone is roughly aligned with the prefinal syllable but lacks near-constant
timing (see Maskikit-Essed & Gussenhoven 2016: 356). This will become obvious 
from the discussion in the following sections §3.2.1 and §3.2.2. 

The IU - singleton IUs and embedded IUs of CIUs — regularly maps onto syn- 
tactic or grammatical units, such as a subject or object NP, a verb or a VP, an 
adverbial phrase or a complement clause. This observation will be discussed in 
§3.3. 

The only tonal event in singleton IUs is the obligatory final boundary-tone 
complex. In CIUs, each embedded IU is marked by a final boundary-tone com- 
plex. The syntactic content is decisive in whether a construction is uttered as a 
singleton IU or a CIU consisting of several embedded IUs (see §3.3). Hence, the 
difference between embedded IUs of CIUs and non-embedded IUs - singleton 
IUs or final IUs of CIUs - lies in their co-occurrence with other boundary phe- 
nomena such as pitch reset, final lengthening, pauses and glottalization. In the 
remainder of the work, I will use CIU as a cover term to refer to both single- 
ton IUs and Compound IUs - hence, those which co-occur with other boundary 
phenomena — as opposed to embedded IUs of CIUs. 

An important observation is that the tonal marking at the right-edge bound- 
ary of singleton IUs, embedded non-final IUs of CIUs, and final IUs of CIUs is 
essentially the same, as demonstrated in the following sections §3.2.1 and §3.2.2.


3.2.1 Tonal events at the right-edge boundaries of CIUs


In this section, I discuss tonal events occurring at the right-edge boundary of 
CIUs, i.e., the final IUs of CIUs and singleton IUs respectively, as exemplified in 
Figure 3.5. 


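[Schematic: an IU, represented as a bracketed string of syllables (σ) followed by #, with a T-T% boundary-tone complex at its right edge]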


Figure 3.5: Visualization of final IUs of CIUs and singleton IUs dis- 
cussed in this section 


Tonal events at the right-edge boundary of CIUs and singleton IUs can be 
classified into a set of three different tonal contours. To classify and annotate 
their pitch contours, I visually inspected all IUs. The summarizing plot of these
is shown in Figure 3.6. The three contours are explained in the following. 


[Figure 3.6, x-axis: antepenultimate, penultimate and ultimate syllable]


Figure 3.6: Spaghetti plot showing the intonation contours of the fi- 
nal three syllables of each of the three final boundary-tone complexes 
with the items superimposed on each other and with an average contour
produced by Loess smoothing in R (R Core Team 2017). Vertical bars indicate
syllable boundaries; only CIUs with final CV.CV.CV]CIU# syllable
structure are displayed; values are z-transformed per CIU.
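
The z-transformation mentioned in the caption removes differences in pitch register between CIUs, so that contours from different speakers and registers can be superimposed. The Python sketch below illustrates the normalization step only; the smoothing is not reproduced. The pitch values are hypothetical, and the use of the population standard deviation is an assumption of this sketch rather than a documented detail of the original analysis.

```python
from statistics import mean, pstdev


def z_transform(pitch_values):
    """Express each pitch value as its distance from the CIU mean in sd units."""
    mu = mean(pitch_values)
    sigma = pstdev(pitch_values)
    if sigma == 0:
        # A completely flat contour has no variation to scale by.
        return [0.0 for _ in pitch_values]
    return [(p - mu) / sigma for p in pitch_values]


# Hypothetical pitch samples in semitones for two CIUs in different registers.
low_register = [90.0, 92.0, 94.0]
high_register = [100.0, 102.0, 104.0]

z_low = z_transform(low_register)
z_high = z_transform(high_register)
```

After the transformation, the two toy contours coincide, since they have the same shape and differ only in register.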


In the model proposed here, the final boundary-tone complex is analyzed as 
consisting of a phrase tone (T-) and a boundary tone (T%). The phrase tone can 
be a low tone L-, a high tone H- or a rising tone LH-. The boundary tone can be 
either a low tone L% or a high tone H%. Figure 3.7 depicts schematic versions of 
the three possible combinations of a phrase tone and a boundary tone, including 
two rising patterns and one falling pattern. 

The two rising patterns are rather similar and their main difference is the do- 
main of the pitch rise, or the low tone L- respectively. However, they are clearly 
distinct in their function, as I show below (cf. Table 3.3). 

In the following section, I will provide a brief summary of each contour and 
illustrate them with examples from the corpus. After that, I will discuss their 
functions and distribution in the corpus. For the purpose of exemplification, only 
singleton IUs will be displayed in the figures below. 


[Figure 3.7: schematic pitch contours of the three CIU-final boundary-tone complexes L-H%, (L)H-L% and LH-H% over the final syllables of a CIU]


Figure 3.7: Schematic representation of CIU-final boundary-tone com- 
plexes: vertical bars indicate syllable boundaries. 


3.2.1.1 The L-H% boundary-tone complex 


An instance of the L-H% boundary-tone complex is given in the periogram of 
example (1) in Figure 3.8. 

Pitch starts around the middle of the speaker’s current range. Over the initial 
15 syllables, pitch remains near level and then drops 4 st towards the low target 
of the L- phrase tone located at the boundary between the penultimate (.ni.) and 
the ultimate syllable (.pu#) of the IU. On the last syllable, pitch rises almost 19 st 
to the high target of the H% boundary tone. 
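
Pitch excursions are given in semitones (st) throughout this section. As a reminder of how such values relate to frequencies in Hz, the following Python sketch applies the standard conversion, in which an interval of 12 st corresponds to a doubling of the frequency. The function names and the frequencies in the example are illustrative assumptions and are not measurements from the corpus.

```python
import math


def semitone_interval(f_start_hz, f_end_hz):
    """Signed interval in semitones between two frequencies in Hz."""
    return 12 * math.log2(f_end_hz / f_start_hz)


def hz_to_semitones(f_hz, reference_hz):
    """Pitch in semitones relative to a reference frequency in Hz."""
    return 12 * math.log2(f_hz / reference_hz)


# Illustrative frequencies only (not corpus measurements).
octave_rise = semitone_interval(110.0, 220.0)  # doubling the frequency
octave_fall = semitone_interval(220.0, 110.0)  # halving the frequency

# The size of an interval does not depend on the chosen reference frequency.
same_interval = hz_to_semitones(220.0, 1.0) - hz_to_semitones(110.0, 1.0)
```

Because the scale is logarithmic, a rise of a given size in st corresponds to the same frequency ratio regardless of where in a speaker's range it occurs.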


[Figure 3.8: periogram; x-axis: time, 0 to 3000; y-axis: pitch in st, 95 to 115; text tier: daan tooka nemenek sia lau memenek naafikmo lau monipu]


Figure 3.8: Periogram with pitch track (in st) for example (1), with final 
boundary-tone complex L-H%, speaker SELP 

(1) daan tooka nemenek isia lau memenek naafik lau monipu
daan tooka noN-penek isia lau moN-penek no-afik
later finished AV.RLS-climb 3S presently AV-climb ST.RLS-busy
lau moN-tipu
presently AV-pick
‘after he climbed; eagerly picking (pears)’ (pearstory 36 SELP.015) >


Speaker SELP shows a very high pitch range, especially on the final syllable 
(.pu#). This is not uncommon in the corpus and is frequently observable when 
speakers are very engaged in their conversation or narration (cf. example (15) /
Figure 3.29, example (18) / Figure 3.32, example (17) / Figure 3.31).


3.2.1.2 The (L)H-L% boundary-tone complex 


I analyze the combinations of an LH- or H- phrase tone with the boundary tone 
L% as variations of the same tonal complex, referred to here as (L)H-L%. The 
difference resides in the domain of the pitch rise, as indicated by the dashed line 
in Figure 3.7. In the LH-L% variant, the main domain for the pitch rise is the 
penultimate syllable. In the H-L% variant, on the other hand, the rise in pitch 
may extend over several syllables. 

Consider first the LH-L% variant, which is exemplified by the pitch contour of 
example (2) in Figure 3.9. Pitch begins around the middle of the speaker’s current 
range and remains near flat with a slight downtrend over the first 7 syllables of 
the IU. Starting at the beginning of the penultimate syllable (.du.), pitch rises 10 
st to the high target of the LH- phrase tone at the beginning of the vowel of the 
last syllable (.na#). Pitch then drops 12 st to the low target of the L%
boundary tone of the IU.


(2) tauna dei anu baduna
tau-0=na dei anu badu=na
put-UV=3S.GEN LOC FILL shirt=3S.GEN


‘it is fixed at his whatchamacallit shirt’ (pearstory 12 RSTM.055) > 


The H-L% boundary-tone complex is illustrated by (3), for which the peri- 
ogram is shown in Figure 3.10. The rise to the high target does not occur on 
the penultimate syllable exclusively, but instead happens gradually. Hence, the 
contour is labeled H-L%. 

The pitch contour in Figure 3.10 shows that pitch begins mid-level of the 
speaker’s current range and gradually rises about 4 st over the initial 6 sylla- 
bles of the IU. The pitch rise reaches the high target of the H- phrase tone at the 

[Figure 3.9: periogram; x-axis: time, 0 to 1000; y-axis: pitch in st (re 0.51 Hz)]

Figure 3.9: Periogram with pitch track (in st) for example (2), with final 
boundary-tone complex LH-L%, speaker RSTM 


beginning of the vowel of the ultimate syllable (.ka#) and then drops about 15 st 
to the low target of the L% boundary tone. 


[Figure 3.10: periogram; x-axis: time, 0 to 2000; y-axis: pitch in st, 90 to 105; text tier: saasalu kololannako]


Figure 3.10: Periogram with pitch track (in st) for example (3), with 
final boundary-tone complex H-L%, speaker SP 


(3) saasalu koloannako 
RDP-salu koloan=na=ko 
RDP-to.face right=3s.GEN=AND 


‘it is facing to the right side’ (spacegames_sequence4_KSR-SP.035) >


3.2.1.3 The LH-H% boundary-tone complex


An instance of the LH-H% boundary-tone complex is found in example (4), for 
which the periogram is shown in Figure 3.11. Speaker ZBR enumerates several cultural
events and festivities. Similar to example (2), the domain for the pitch rise is
the penultimate syllable. On the last syllable, pitch remains (near) high. Similar 
to the L-H% above, the pitch contour shows a slight dip towards the end of the 
IU. This is even visible in the summarizing spaghetti plot in Figure 3.6 and can be
interpreted as an anticipation of the following low tone. 


[Figure 3.11: periogram; x-axis: time, 0 to 4000; y-axis: pitch in st, 88 to 104; text tier: manabin mongulan ballamate]


Figure 3.11: Periogram with pitch track (in st) for example (4), with final 
boundary-tone complexes LH-H%, speaker ZBR 


(4) a. manabing 
moN-kabing 
AV-marry 
‘to marry’ 

b. mongulan 
moN-gulan 
AV-to.cradle 
‘to cradle’ 

c. ballamate 
ballamate 
funeral.ceremony 
‘the funeral ceremony’ 


(explanation-wedding-tradition_ZBR.265-267) > 

While the LH-H% boundary-tone complex and the L-H% boundary-tone complex
share some similarities, the main difference lies in the pitch rise domain.
The two patterns can be easily differentiated from each other. 


3.2.1.4 Distribution 


The three boundary-tone complexes are the main tonal events occurring at the 
right-edge boundary of CIUs, i.e. singleton IUs and final IUs of CIUs. In addition 
to final boundary-tone complexes, there are rarely occurring discourse particles 
which attach to one of the boundary-tone complexes. They are not included in 
Figure 3.12 but are described in §3.2.4. 

The frequency distribution of the different final boundary-tone complexes 
over the different data types is shown in Figure 3.12. Clearly evident is the fact 
that the (L)H-L% and the L-H% boundary-tone complexes are the two major pat- 
terns. The LH-H% pattern is a minor pattern but is very distinct in its function, 
as I will show in the next section (§3.2.1.5). 

Figure 3.12: Frequency distribution of tonal events at the right edge of
substantive CIUs within conversational and monological recordings, 
numbers are rounded to one decimal place. 

The frequency distribution within the two data types is similar, with both
types attesting more (L)H-L% boundary tones than L-H%. The distribution is 
more heavily skewed in conversational data, in which 60.1% of CIUs occur with 
the (L)H-L% boundary tone and 29.3% with the L-H%. This trend is less pro- 
nounced in the monological data, where there is a difference of only 10 percent- 
age points. Furthermore, monological data show more final LH-H%. 
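The comparison above rests on simple proportions. For the conversational figures cited, the gap is 60.1 - 29.3 = 30.8 percentage points, against only 10 in the monological data. As a minimal sketch of how such shares could be derived from a list of annotated boundary-tone labels, the following may help; the function name and the toy label list are assumptions for illustration and are not corpus counts.

```python
from collections import Counter


def tone_distribution(labels):
    """Return the percentage share of each label, rounded to one decimal place."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical toy annotations (NOT corpus data): one label per CIU.
toy_labels = ["(L)H-L%"] * 6 + ["L-H%"] * 3 + ["LH-H%"] * 1

shares = tone_distribution(toy_labels)
# Difference between the two major patterns, in percentage points.
gap = round(shares["(L)H-L%"] - shares["L-H%"], 1)
```

Running the same function separately on conversational and monological label lists would give the two distributions that Figure 3.12 compares.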


3.2.1.5 Function 


The difference in distribution can be explained by the different functions of
CIU-final boundary-tone complexes. These are summarized in Table 3.3.

Table 3.3: Summary of functions of CIU-final boundary-tone complexes

CIU-final boundary-tone complex    Function
L-H%                               continuation
(L)H-L%                            finality
LH-H%                              non-final elements of lists


Monologues yield a higher proportion of CIUs with the LH-H% pattern be- 
cause, in descriptive texts in the corpus, speakers frequently enumerate refer- 
ents, events or entities; see example (4) above. In lists, especially for non-final 
elements of lists, the LH-H% is the preferred intonation pattern. However, the 
pattern is not exclusively reserved for such CIUs but can also occur in other en- 
vironments where the speaker wants to express a high degree of continuation. 
Similarly, (non-final) elements of lists can also be uttered with the L-H% when 
signaling less strong continuation. 

The distributional differences between the boundary-tone complexes observed 
in Figure 3.12 support the proposed functions in the following way: The higher 
proportion of CIUs with final (L)H-L% in conversations as compared to mono- 
logues reflects the fact that in conversations a paragraph may consist of a ques- 
tion and an answer only, while in monologues narrations are organized into 
longer paragraphs, containing several CIUs. 

The function of (L)H-L% as signaling finality and L-H% as signaling continua- 
tion is nicely illustrated by Tail-Head Linkage (THL) constructions in narrations. 
de Vries (2005: 262) defines THL as 


[...] a way to connect clause chains in which the last clause of a chain is 
partially or completely repeated in the first clause of the next chain. 


In Totoli, instances of THL occur mostly in unplanned narratives and con- 
tribute to what de Vries (2005: 378) calls processing ease, as it links paragraphs 
and maintains event coherence. At the same time, they serve as a planning device, 
allowing speakers more time to plan the next paragraph. In the corpus, record- 
ings of retellings of the Pear Story yield a considerable number of instances of 
THL constructions, as speakers are given the task to extemporize a coherent ac- 
count of the story-line of a previously unknown story. In the Pear Story retellings, 
CIUs are usually grouped into higher-level units above the CIU which may be 
termed “paragraphs” (see Himmelmann & Ladd 2008: 251). A paragraph consists
of a series of CIUs ending on L-H% and concludes with a final CIU marked by the
(L)H-L% pattern. In a THL construction, the final CIU (the Tail of a paragraph)
is repeated in full or in part as the Head of the subsequent paragraph. The excerpt
of a Pear Film retelling in example (5a)-(5n) provides an illustration.


(5) a. bali tau pagauan L-H% 
‘So a gardener ... 
b. <na> kononipu alpukaatna L-H% 
‘... is picking the avocados; 
c. nipenekanna dei batanna L-H% 
‘he is climbing up the trunk; 
d. niambinnamai uliai bab» L-H% 


‘he is getting (the avocados down) from the top, 


e. sagaat nadabumai dei buta L-H% 
‘half (of the avocados) fell to the ground, 

f. bai indzan nakaalamai L-H% 
‘and then after (he) picked (them) up, 

g. ninauna poniai <moi> nitauna dei karangan L-H% 
‘he brought them there and put them in the basket, 

h. dei llenget L-H% 
‘in the hamper, 

i. kaddaan tau L-H% 
‘(then) there was a person, 

j. notumalibko L-H%
‘(he) passed by, 

k. biibindas toolan H-L%
‘(and he) pulled a goat: 

l. biibindas toolan L-H% 


‘(Though the person) pulled a goat, 
m. tapi ganega tumalibko H-L% 
‘he only passed by. 
n. inga daan noosa kaddaanmai mannana saasapeda L-H% 
‘Not long after that, there came a child, cycling’ 
(pearstory_11_SP.001-014) >

Examples (5a)-(5k) form one paragraph. The CIUs in (5a)-(5j) are each uttered
with the L-H% boundary-tone complex, marking non-finality. The final CIU of the
paragraph, its Tail, is (5k) biibindas toolan ‘pulling a goat’, which is
uttered with the H-L% boundary-tone complex. It is repeated in (5l) as the Head of
the subsequent paragraph, this time bearing the L-H% boundary-tone complex.
The realizations of the two CIUs of the THL construction in (5k)-(5l) above are
displayed in Figure 3.13.

Figure 3.13: Periogram with pitch track (in st) for example (5k)-(5l),
realization of the two segmentally identical CIUs of the THL construction;
speaker SP


These examples support the hypothesis that (L)H-L% signals finality, while L- 
H% serves as “continuer”, signaling non-finality of a CIU with regard to a higher- 
level discourse unit. Note that both contours occur in interrogative as well as 
declarative sentences. 
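The paragraph structure described here (a series of CIUs ending on L-H%, closed by a CIU with (L)H-L%, with the Tail repeated as the next Head) can be made explicit with a short sketch. The code below is only an illustration under stated assumptions: the function names are hypothetical, the word-overlap test is a crude stand-in for "repeated in full or in part", and the data are the transcriptions and tones of (5i)-(5n), with the Tail and Head spelled identically, following the caption of Figure 3.13.

```python
# Realizations of (L)H-L%, which signals finality and closes a paragraph.
FINAL_TONES = {"H-L%", "LH-L%"}


def split_paragraphs(cius):
    """Group (text, tone) pairs into paragraphs closed by a final tone."""
    paragraphs, current = [], []
    for text, tone in cius:
        current.append((text, tone))
        if tone in FINAL_TONES:
            paragraphs.append(current)
            current = []
    if current:  # trailing paragraph that has not been closed yet
        paragraphs.append(current)
    return paragraphs


def thl_links(paragraphs):
    """Return indices i where the Head of paragraph i+1 repeats words of the Tail of i."""
    links = []
    for i in range(len(paragraphs) - 1):
        tail_text, _ = paragraphs[i][-1]
        head_text, head_tone = paragraphs[i + 1][0]
        shared = set(tail_text.split()) & set(head_text.split())
        # Assumption: a Head counts as linked if it shares a word and is continuative.
        if shared and head_tone == "L-H%":
            links.append(i)
    return links


excerpt = [
    ("kaddaan tau", "L-H%"),            # (5i)
    ("notumalibko", "L-H%"),            # (5j)
    ("biibindas toolan", "H-L%"),       # (5k), Tail
    ("biibindas toolan", "L-H%"),       # (5l), Head
    ("tapi ganega tumalibko", "H-L%"),  # (5m)
    ("inga daan noosa kaddaanmai mannana saasapeda", "L-H%"),  # (5n)
]

paragraphs = split_paragraphs(excerpt)
links = thl_links(paragraphs)
```

On this excerpt the sketch yields three paragraphs, the last of them still open, and a single Tail-Head link between the first and the second.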


3.2.2 Tonal events at the boundaries of non-final, embedded IUs of 
CIUs 


This section deals with tonal events at the right-edge boundaries of non-final IUs 
of CIUs. This is exemplified in Figure 3.14 below. 

Tonal events at the right-edge boundary of non-final, embedded IUs of CIUs
can equally be classified into three different tonal contours. The summarizing
plot of the three boundary-tone complexes is shown in Figure 3.15.

The final boundary-tone complex consists of a phrase tone (T-) and a boundary
tone (T%): the phrase tone is a low tone L-, a high tone H- or a rising tone
LH-, while the boundary tone can be either a low tone L% or a high tone H%.
Figure 3.16 depicts a schematic version of the three final boundary-tone complexes
of non-final, embedded IUs of CIUs. In the following, I briefly summarize each
boundary-tone complex. Subsequently, I illustrate them with examples and then
describe their distribution in the corpus.

[Schematic: bracketed sequences of syllables (σ), each with a T-T% boundary-tone complex at its right edge]

Figure 3.14: Visualization of non-final IUs of CIUs being discussed in
this section

[Plot; x-axis labels: antepenultimate, penultimate, ultimate, initial]

Figure 3.15: Spaghetti plot showing the intonation contours of the
final three syllables of each of the three final boundary-tone complexes
of embedded and non-final IUs, with the items superimposed
on each other and with an average contour produced by Loess smoothing
in R (Team 2017); vertical bars indicate syllable boundaries; only
items with final CV.CV.CV]IU [CV... syllable structure are displayed; values are
z-transformed by IU.
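The caption of Figure 3.15 states that pitch values are z-transformed before the contours are superimposed. As a rough sketch of what such a normalization involves, the following standardizes the pitch samples of one IU so that contours from different speakers and registers become comparable. The function name, the use of the population standard deviation, and the sample values are assumptions; the Loess smoothing mentioned in the caption is not reproduced here.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize pitch samples of one IU to mean 0 and standard deviation 1."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population SD; a sample SD would also be possible
    if sd == 0:
        return [0.0 for _ in values]  # flat contour: no variation to scale
    return [(v - m) / sd for v in values]


# Hypothetical pitch samples in st for a single IU (NOT measured values).
iu_pitch_st = [90.0, 92.0, 94.0]
iu_z = z_transform(iu_pitch_st)
```

After the transformation each IU has mean 0 and unit variance, so only the shape of the contour remains.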

The three final boundary-tone complexes of embedded IUs of CIUs are illus- 
trated with examples from the corpus. For the purpose of exemplification, only 
CIUs consisting of two embedded IUs are used. Hence, they only have a single 
CIU-internal tonal event, i.e., that of the non-final, embedded IU.

Figure 3.16: Schematic representation of final boundary-tone complexes
of embedded IUs; vertical bars indicate syllable boundaries


3.2.2.1 The L-H% boundary-tone complex 


An instance of the final boundary-tone complex L-H% is found in example (6), 
for which the periogram is shown in Figure 3.17. 

The initial word of the CIU, sapeda ‘bicycle’, forms its own embedded IU, de- 
marcated by the final boundary-tone complex L-H%. Pitch starts around the mid- 
dle of the speaker’s current range and it reaches the low target of the L- phrase 
tone located at the boundary between the penultimate (.pe.) and the ultimate syl- 
lable (.da) of the IU. On the ultimate syllable, pitch rises 6 st to the high target of 
the H% boundary tone.

Figure 3.17: Periogram with pitch track (in st) for example (6), example
of final boundary-tone complex L-H% on embedded IU sapeda, speaker
SP

(6) sapeda | nollumpakmoko dei batu
sapeda no-RDP-lumpak=mo=ko dei batu
bicycle ST.RLS-RDP-hit.against=CPL=AND LOC stone
‘the bicycle hit the stone’ (pearstory_14_SP.028) >
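Pitch movements in this chapter are reported in semitones (st), such as the 6 st rise on the ultimate syllable of sapeda. For readers less used to the scale, the sketch below shows the standard logarithmic relation between Hz and st. The function names and the default reference frequency are assumptions chosen for illustration; intervals in st do not depend on the reference.

```python
import math


def hz_to_st(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12 * math.log2(f_hz / ref_hz)


def st_interval_to_ratio(interval_st):
    """Return the frequency ratio corresponding to an interval in semitones."""
    return 2 ** (interval_st / 12)


# A 6 st rise corresponds to a frequency ratio of 2 ** 0.5.
ratio_6st = st_interval_to_ratio(6)

# Hypothetical values: the interval between 100 Hz and 200 Hz is one octave.
octave_st = hz_to_st(200.0) - hz_to_st(100.0)
```

A rise of 6 st thus multiplies the fundamental frequency by about 1.41, whatever the starting pitch.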


3.2.2.2 The (L)H-H% boundary-tone complex 


The (L)H-H% is a summarizing label designating the two variants LH-H% and
H-H%. The difference is that in the LH-H% variant the main domain for the pitch 
rise is the penultimate syllable. In the H-H% variant, on the other hand, the rise 
in pitch may extend over several syllables. The two variants are illustrated below. 
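The labelling convention used here, a phrase tone followed by a boundary tone with parentheses marking an optional L, can be spelled out mechanically. The following is a small sketch, with hypothetical function names, that splits a label into its two tones and expands a summarizing label such as (L)H-H% into its variants.

```python
import re

# Phrase tone: (L)H, LH, L or H; boundary tone: L% or H%.
LABEL = re.compile(r"^(\(L\)H|LH|L|H)-([LH])%$")


def split_tones(label):
    """Split a boundary-tone complex into (phrase tone, boundary tone)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"not a boundary-tone complex: {label!r}")
    phrase, boundary = match.groups()
    return phrase + "-", boundary + "%"


def expand(label):
    """Expand a summarizing label with optional L into its two variants."""
    phrase, boundary = split_tones(label)
    if phrase.startswith("(L)"):
        return ["LH-" + boundary, "H-" + boundary]
    return [label]


variants = expand("(L)H-H%")
```

Labels without parentheses, such as L-H%, are returned unchanged.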

An instance of the LH-H% boundary-tone complex is found in example (7), for
which the periogram is given in Figure 3.18. The initial five words form a separate
IU, the final word of which is kakaita ‘our grandfather’.

Pitch starts around the middle of the speaker’s current range and gradually
drops about 3 st over the first 8 syllables. On the prefinal syllable (.kai.) of the
final word kakaita ‘our grandfather’ of the first IU, pitch rises 5 st towards the
high target of the LH- phrase tone at the beginning of the final syllable (.ta).
Pitch then remains high over the final syllable of the IU as the IU ends on an H%
boundary tone.

Figure 3.18: Periogram with pitch track (in st) for example (7), example
of LH-H% boundary-tone complex on kakaita, speaker SYNO

(7) anu maga ulai sei kakaita | ai bakeleta
anu moga uli=ai sei kakai=ta ai bakele=ta
REL only from=VEN HON grandfather=1PI.GEN and grandmother=1PI.GEN
‘which (only) comes from our grandfathers and grandmothers’
(explanation-lelegesan_SYNO.002) >


The H-H% variant of the boundary-tone complex is exemplified in (8), with its
corresponding periogram depicted in Figure 3.19. The first three words form a
separate IU, the final word of which is kami ‘1PE’. Here, the domain of the pitch
rise to the high target of the H% boundary tone is not the prefinal syllable (ka.)
exclusively, but extends over several syllables (mo.i.ta.ka.).

Figure 3.19: Periogram with pitch track (in st) for example (8), example
of H-H% boundary-tone complex on kami, speaker ZBR

(8) ana mita kami | dei lipulipu giigii ana
ana mo-ita kami dei RDP-lipu RDP-gii ana
if POT-see 1PE LOC RDP-country RDP-different MED
‘when we look at that in other countries’
(explanation-wedding-tradition_ZBR.258) >


3.2.2.3 The (L)H-L% boundary-tone complex


Lastly, the (L)H-L% boundary-tone complex and its two realizations H-L% and
LH-L% are exemplified. As with (L)H-H%, in the LH-L% variant the main
domain for the pitch rise is the penultimate syllable. In the H-L% variant, on the
other hand, the rise in pitch extends over several syllables. The two variants are
illustrated below.

An instance of the final boundary-tone complex H-L% is displayed in the pe- 
riogram of example (9) in Figure 3.20. The initial word moane ‘man’ forms a 
separate IU, demarcated by the IU-final boundary-tone complex H-L%. 

Pitch starts around the middle of the speaker’s current range. Pitch then rises 
10 st over the first two syllables until it reaches the high target of the H- phrase 
tone located between the penultimate (.a.) and the ultimate syllable (.ne) of the 
IU. On the ultimate syllable, pitch drops 7 st to the low target of the IU-final L% 
boundary tone.

Figure 3.20: Periogram with pitch track (in st) for example (9), example
of final boundary-tone complex H-L% on moane, speaker SP

(9) moane | ana lau monuludan ata ana
moane ana lau moN-sulud-an oto ana
man MED presently AV-push-APPL car MED
‘it is a man who is pushing that car’ (QUIS-focus_SP.010) >


Example (9) is particularly interesting. In this example, the word moane ‘man’
is informationally important and is therefore constructed as the first element of a
cleft construction. The construction marks the focus on moane ‘man’, which is
then followed by ana lau monuludan ota ana ‘this (one) is pushing the car’. The
pitch contour on moane ‘man’ could potentially be interpreted as a
prominence-lending pitch movement on a word that is in focus. However, in Chapter 2,
and in particular in §2.2, I showed that Totoli does not make use of such means to mark
focus. Focus is expressed by means of a cleft construction in this case. The focus
constituent is then uttered in its own IU with an H-L% boundary tone
that signals finality. The prosodic realization of syntactic constructions is further
discussed in §3.3.

Finally, the LH-L% variant is illustrated by example (10) and its periogram in
Figure 3.21. The first three words form a separate IU, the last word of which is
poni ‘again’.

Pitch begins around the middle of the speaker’s current range and initially drops
about 4 st. On the prefinal syllable (po.), pitch rises 9 st towards the high target
of the H- phrase tone, located at the beginning of the penultimate syllable of the
IU-final word poni. On the final syllable, pitch drops about 7 st.

Figure 3.21: Periogram with pitch track (in st) for example (10), LH-L%
boundary-tone complex on poni, speaker SP

(10) io kita poni | maanuai
io kita poni mo-anu=ai
OK 2S again AV-FILL=VEN
‘ok, now you do one again’ (spacegames_sequence1_KSR-SP.252) >


3.2.2.4 Distribution 


The distribution of the different final boundary-tone complexes of non-final, em- 
bedded IUs is shown in Figure 3.22. 

The distribution shows that the (L)H-H% and the L-H% are the major final
tonal patterns of non-final, embedded IUs of CIUs. The (L)H-L% pattern is only
marginally attested. The distribution is similar over the different data types.

Figure 3.22: Frequency distribution of tonal events at the right-edge
boundary of embedded, non-final IUs, within conversational and
monological recordings


3.2.3 Phonetic variability 


The segmental material influences the realization of the pitch contours: a lack
of sonorant material in the final and prefinal syllables means that pitch
rises or falls are only partly realized.

For the purpose of illustration, I use examples of singleton IUs or final IUs
of CIUs with final boundary tones (L)H-L% and L-H% respectively, as they are 
the major final boundary tones. The effects are the same for the final LH-H% 
boundary-tone complex and also for final boundary-tone complexes of embed- 
ded, non-final IUs of CIUs. 

Consider Figure 3.23 and Figure 3.24, which are the periograms of another
example of a THL construction, given in example (11). As is the case for all THL
constructions, the first CIU bears the final (L)H-L% boundary-tone complex and the
second one the L-H%. These are only partly realized, due to the voiceless plosives
[k] and [t] in the onsets of the prefinal (ki.) and the final syllable (.ta#).

In Figure 3.23, the rise in pitch on the penultimate syllable of the CIU (ki.) is
interrupted and is only visible on the short vowel of the penultimate syllable.
The drop in pitch of about 11 st towards the low boundary tone L% is realized
in full on the final syllable. In Figure 3.24, the rise to the high target of the H%
boundary tone is interrupted and hence results in a jump in pitch of about 6 st
on ki.ta#.


(11) a. anu ampi koloanan | saasaluai kita
anu ampi koloanan RDP-salu=ai kita
REL side right RDP-to.face=VEN 2S
‘So the (one) on the right-hand side is facing you’

b. ampi kaloanan | saasaluai kita
ampi koloanan RDP-salu=ai kita
side right RDP-to.face=VEN 2S
‘the (one) on the right-hand side is facing you’
(spacegames_sequence4_KSR-SP.231 & 233) >

Figure 3.23: Periogram with pitch track (in st) for example (11a) with
CIU-final boundary-tone complex LH-L%, speaker SP

Figure 3.24: Periogram with pitch track (in st) for example (11b) with
CIU-final boundary-tone complex L-H%, speaker SP

If the segments of the final and the prefinal syllable are fully sonorant, the
boundary-tone complexes are fully realized, as can be seen in the two realizations
of example (12) in Figure 3.25 and Figure 3.26.

Figure 3.25: Periogram with pitch track (in st) for example (12) with
IU-final boundary-tone complex L-H%, speaker SP

(12) molitengean
moli-tenge-an
RCP-back-RCP
‘(they are) back to back’ (spacegames_sequence4_KSR-SP.071 & 105) >, >
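The contrast between kita in example (11) and molitengean in example (12) can be stated as a simple check on the last two syllables of an IU. The sketch below is only an approximation under stated assumptions: it works on orthographic syllables supplied by hand, treats just the voiceless plosives p, t and k as interrupting the pitch track, and uses a hypothetical function name.

```python
# Assumption: only these segments are treated as interrupting voicing.
VOICELESS_PLOSIVES = frozenset("ptk")


def fully_sonorant(final_syllables):
    """Return True if none of the given syllables contains a voiceless plosive."""
    return not any(
        segment in VOICELESS_PLOSIVES
        for syllable in final_syllables
        for segment in syllable
    )


# Prefinal and final syllables, syllabified by hand.
kita_ok = fully_sonorant(["ki", "ta"])          # example (11)
molitengean_ok = fully_sonorant(["nge", "an"])  # example (12)
```

Under these assumptions the check predicts a partly realized contour for kita and a fully realized one for molitengean, in line with the description of Figures 3.23 to 3.26.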


For final syllables with a short vowel, the main domain of the rise or fall to the
T- phrase tone is the penultimate syllable. Segmental material affects the shape
of the tonal contours but not the location of tonal targets. However, if the IU-final
syllable involves a long vowel, the tonal targets of both the phrase tone and the
boundary tone are realized in full on that syllable.

For an illustration of the LH-L% and the L-H% on a final syllable with a long
vowel, see examples (13) and (14), which both end on the word pomo ‘back
then/first’. Two different realizations are given in Figure 3.27 and Figure 3.28.

The pitch contour of example (13) in Figure 3.27 shows that the rise to the high
target of the LH- phrase tone and the subsequent fall to the low target of the L%
boundary tone are both realized on the ultimate syllable (.mo).

Figure 3.26: Periogram with pitch track (in st) for example (12) with
IU-final boundary-tone complex LH-L%, speaker SP

Figure 3.27: Periogram with pitch track (in st) for example (13) with
IU-final boundary-tone complex LH-L% on a final syllable with a long
vowel, speaker SYNO

(13) ramean tau pomo
rame-an tau pomo
lively-NMLZ person back.then
‘The crowd/amusement of the people back then’
(explanation-lelegesan_SYNO.085) >

Similarly, the realization of example (14) in Figure 3.28 shows that the drop
in pitch towards the low target of the L- phrase tone and the subsequent rise
towards the high target of the H% boundary tone are also both realized on the
final long syllable (.mo).

Figure 3.28: Periogram with pitch track (in st) for example (14) with
CIU-final boundary-tone complex L-H% on a final syllable with a long
vowel, speaker RD

(14) geipo | sallo pomo
geip=po9 sallo pomo
NEG=INCPL basket first
‘no, but the basket first’ (pearstory_13_RD.015) >


For the purpose of exemplification, I discussed the different boundary-tone
complexes on rather short CIUs above. However, many CIUs contain several
embedded IUs. Therefore, I now discuss two examples of longer CIUs. Example
(15) consists of a long CIU that begins with an embedded IU spanning 5 words
and is demarcated by the final boundary-tone complex L-H% realized on the
final word itu ‘DIST’. The rise in pitch of about 5 st to the high target of the H%
boundary tone is realized merely as a jump, due to the voiceless plosive [t] in the
onset of the final syllable (.tu). Pitch then drops 8 st to the low target of the L-H%
boundary-tone complex of the following IU containing the word sapeda ‘bicycle’.
Pitch then drops towards the low target of the IU-final boundary-tone complex
L-H% of the final IU in the CIU. The final rise on the last syllable (.mpak) is again
realized only as a jump in pitch due to the voiceless syllable onset.

Figure 3.29: Periogram with pitch track (in st) for example (15) with
CIU-final boundary-tone complex L-H%, speaker SP

(15) kaasikan magiigitai manana dolago itu | sapeda | naollumpak
keasikan mog-RDP-ita-i manana dolago itu sapeda no-RDP-lumpak
excitement AV.NRLS-RDP-watch-APPL child girl DIST bicycle AV.RLS-RDP-hit.against
‘because of his excitement in looking at the girl, his bicycle crashed
(against the stone)’ (pearstory_11_SP.025) >


Consider example (16) and its visualization in Figure 3.30. The CIU consists
of several embedded IUs. The first two bear the final boundary-tone complex
LH-H%, realized on siritana ‘this story’ and daan ‘EXIST’. The word maalin ‘to
get lost’ bears the IU-final LH-L% boundary-tone complex. The final IU of the
CIU ends on an L-H% boundary-tone complex.

Figure 3.30: Periogram with pitch track (in st) for example (16) with
IU-final boundary-tone complex L-H%, speaker SYNO

(16) siritana | ia geimo daan | lau makadaang maalin | ia baran ia
sirita=na ia geimo daan lau modko-doon mo-alin ia baran ia
story=3S.GEN PRX not EXIST presently ST.AV-want ST-disappear PRX goods PRX
‘This story will never again get lost, this thing’
(explanation-lelegesan_SYNO.007) >


The question is whether the tonal contours can be explained as involving
IU-final H% boundary tones only, rather than the combination of an LH- or L- phrase
tone with an H% boundary tone. Such an analysis would not capture the fact that 
the pitch rises occur on one syllable only. Analyzing the IU-final rise as an H% 
boundary tone alone would not explain why the pitch contour does not remain 
high after the initial high target towards the end of the first embedded IU and 
the high targets of the high boundary tones of the following IUs. 

In considering example (16) and its visualization in Figure 3.30, it becomes 
evident that we must assume an L- or LH- phrase tone to account for the drop 
in pitch before the final rises on the ultimate syllables. The alternative would be 
to assume that embedded IUs begin with an IU-initial low tone. This, however, 
would not explain why pitch drops steadily over the entire IU towards the final 
or prefinal syllable. 

To conclude the discussion of phrase-final pitch events, a note on discourse 
particles is due.

3.2.4 Discourse particles


In addition to one of the boundary-tone complexes, a discourse particle can
optionally occur at the end of a singleton IU or CIU. The two prosodic clitics wi and
ee are the most frequently attested in the corpus. Other discourse particles are
not frequent enough to allow for any generalizations. The discourse particles wi
and ee are uttered under a coherent pitch contour together with the host CIU
(either a singleton IU or a CIU), and no pause occurs between them.
Impressionistically, most CIUs sound complete if the ‘prosodic clitic’ is cut off using any
annotation software. These encliticized discourse markers are tonally specified
as either rising or falling, independent of the boundary-tone complex of the host
CIU: they are specified for either H%, to signal continuation, or L%, to
signal finality.

A frequently occurring discourse particle in the corpus is wi. Similar to Indone- 
sian kan, the Totoli discourse marker wi is used as “a request of verification or 
confirmation, or it may be a marker of conjoint knowledge” (Wouk 1998: 403). In 
the corpus, this discourse particle frequently occurs in recordings of the Space 
Game task (Levinson et al. 1992). In this task, two participants are each given an 
identical set of photos and must find matching photos in a memory game. As one 
participant has to identify the photo being described by the second participant 
without seeing the latter’s stack of photos, the consultants frequently ask for 
verification or confirmation of whether the photo selected indeed matches the 
intended image. In these contexts, the discourse marker wi is used. It is tonally 
specified for H%. 

Figure 3.31 shows the realization of example (17), with wi in CIU-final position. 


(17) dello engaengat | anu dei ulin | wi
dello RDP-engat anu dei ulin wi
like RDP-lift.up REL LOC back INTJ
‘like being lifted up, the one at the back, right?’
(spacegames_sequence3_KSR-SP.225) >


The L-H% boundary-tone complex is realized on the word ulin ‘back’. After a
high target on the last syllable of engaengat ‘being lifted’ in the first IU, pitch drops
towards the low target of the L- phrase tone located at the boundary between
the penultimate and ultimate syllables of the IU-final word ulin ‘back’. On the
final syllable, pitch rises 6 st towards the high target of the H% boundary
tone. The discourse marker wi occurs after the rise of the L-H% boundary-tone
complex, extending it by another 15 st.

Figure 3.31: Periogram with pitch track (in st) for example (17) with
CIU-final boundary-tone complex L-H%, followed by the discourse particle
wi with H% tone, speaker SP


Taken from the same recording, example (18) is an instance of the same prosodic
clitic realized after an IU-final LH-L% boundary-tone complex, as shown in
Figure 3.32.

Figure 3.32: Periogram with pitch track (in st) for example (18) with
final boundary-tone complex LH-L%, followed by the discourse particle
wi with H% tone, speaker SP

(18) molitengean | wi
moli-tenge-an wi
RCP-back-RCP INTJ
‘(they are) back to back, right?’ (spacegames_sequence4_KSR-SP.124) >

Figure 3.32 shows the pitch rising 6 st towards the high target of the phrase
tone, located at the beginning of the syllable .an. The subsequent final major
drop in pitch is only partially realized, as the rise to the H% tone of the discourse
particle extends into the coda of the preceding syllable.

Note the slight dip on the final discourse particle wi in Figure 3.31, which is
even more pronounced in Figure 3.32. This is parallel to the
realization of the L-H% and the (L)H-H% pattern described above and appears to
be a characteristic of the rising patterns in Totoli (see Dombrowski & Niebuhr
2010 for a discussion on convex vs. concave rising patterns).

Another frequently occurring final discourse marker is ee. The discourse marker
is prosodically realized as a clitic, tonally specified for L%. It is used as an
emphasizer, asserting the validity of the question or, as in example (19), reaffirming
the correctness of the statement of the host CIU. Often it is a request for action.
In example (19), the speaker urges the interlocutor to find the intended photo.
Figure 3.33 shows the pitch contour of example (19).

Figure 3.33: Periogram with pitch track (in st) for example (19) with
final boundary-tone complex L-H%, followed by a discourse particle,
speaker KSR

(19) ia | molitengean | ee
ia moli-tenge-an ee
yes RCP-back-RCP INTJ
‘yes, (they are) back to back!’ (spacegames_sequence4_KSR-SP.278) >


The CIU ends on the L-H% boundary-tone complex followed by the L% boundary
tone of the prosodic clitic. The boundary-tone complex is realized on the final
two syllables of molitengean, to which the prosodic clitic is added. On the vowel
of the syllable .nge., the low target of the L- phrase tone is located somewhat
earlier than in contexts without a prosodic clitic. Pitch then rises 5 st towards
the high target of the H% boundary tone, reaching its peak at the boundary of
the syllable .an and the following prosodic clitic. Pitch then drops 9 st towards
the low target of the L% boundary tone of the prosodic clitic.

The combination of an (L)H-L% boundary-tone complex followed by a prosodic clitic
specified for L% is realized either as a sustained pitch plateau following the L% of
the boundary-tone complex, or is integrated into the major final drop in pitch. This
can be seen in Figure 3.34, which depicts the pitch contour of example (20). The major
final fall to the L% of the H-L% boundary-tone complex is realized on the last syllable
of moliulunan ‘being in a row’. The prosodic clitic is then added, resulting in a
further fall at the bottom of the speaker’s range.

Figure 3.34: Periogram with pitch track (in st) for example (20) with
final boundary-tone complex H-L%, followed by a discourse particle,
speaker KSR

(20) moliulunan | ee
moli-ulun-an ee
RCP-row-RCP INTJ
‘(they are) in a row!’ (spacegames_sequence3_KSR-SP.017) >

If the preceding syllable ends on a vowel, the final prosodic clitic tends to be
realized as part of the final major fall in pitch, as shown in Figure 3.35 from
example (21).

Figure 3.35: Periogram with pitch track (in st) for example (21) with
final boundary-tone complex H-L% and a following discourse particle,
speaker KSR

(21) ia poniga | ee
ia poni=ga ee
PRX still=? INTJ
‘there is this one still!’ (spacegames_sequence3_KSR-SP.136) >


3.2.5 Discussion 


In this section, I developed a model of the intonation of Totoli (§3.2) on the
basis of a corpus of (semi-)spontaneous speech (§3.2.2-§3.2.4) and informed by
insights from the experiments described in Chapter 2.

In Totoli, a high proportion of singleton IUs are observed, where the final
boundary-tone complex is the only pitch event. However, CIUs are also common.
Figure 3.36 shows the number of IUs contained in substantive CIUs. The
bins represent the number of IUs contained in a segmented CIU of the corpus,
and the height of the bins represents their overall proportion. The proportion
and absolute numbers are stated above the bins.

The distribution is skewed, and less than 10% of CIUs contain four or more
embedded IUs. In the entire corpus, 43.4% of IUs are singletons, with 41.3% in
monological data and 47.0% in conversational data. CIUs consisting of two
embedded IUs occur at a proportion of 35.3% in the entire corpus, with 36.1% in
conversational data and 34.8% in monological data.

In sections §3.2.1-§3.2.4, I analyzed pitch events at the right-edge boundaries
of singleton IUs, non-final embedded IUs of CIUs, and final IUs of CIUs. I proposed
a classification of tonal events, which includes three boundary-tone complexes
of each type. Table 3.4 provides a summary of the different tonal complexes.

Figure 3.36: Frequency distribution of substantive CIUs containing n
embedded IUs; 1 equals a singleton IU.

Table 3.4: Summary of proposed IU-final boundary-tone complexes

IU-final boundary-tone complexes    IU-final boundary-tone complexes
of non-final IUs of CIUs            of singleton IUs or final IUs of CIUs
(L)H-H%                             LH-H%
L-H%                                L-H%
(L)H-L%                             (L)H-L%


Table 3.4 presents the different boundary-tone complexes, arranged such that 
those with similar tunes are in the same row. The only difference is between 
the (L)H-H% and the LH-H% boundary-tone complexes. I argue that in CIU-final 
position, the domain for the pitch rise is exclusively the penultimate syllable, 
expressed here by an LH- phrase tone. In final position of embedded, non-final 
IUs of CIUs, however, there is more variation with regard to the domain of the 
pitch rise, expressed here by the label (L)H-. 

So far, the main difference between the final boundary-tone complexes of embedded,
non-final IUs of CIUs and those of singleton IUs or final IUs of CIUs is that, by
definition, the latter co-occur with other boundary phenomena which do not occur
at the end of embedded, non-final IUs of CIUs or may not be as pronounced there.
These could include, for instance, a reset in pitch and an interruption in the pitch
contour, a following pause, final syllable lengthening, and vocal fry (cf. Schuetze-Coburn
1994: 93-155, Himmelmann 2006: 260-270, Du Bois et al. 1992, 1993, and Cruttenden
1997: 29-39). In the exposition above, I focused mainly on a discussion of
pitch events. Further investigation of boundary strength may reveal interesting
insights into the interplay of boundary type and boundary phenomena in Totoli
(Schwiertz 2009, Cho 2005, Fougeron & Keating 1997).

Regarding the tonal patterns, the main difference pertains to the distribution 
of the types of boundary-tone complexes. While the (L)H-H% pattern is the main 
pattern at the right-edge boundary of non-final IUs of CIUs, it is the minor 
pattern in CIU-final position, i.e. of singleton IUs and final IUs of CIUs. 
Conversely, (L)H-L% is the main pattern occurring at the right edge of CIUs but 
the minor pattern occurring at the right edge of embedded, non-final IUs of 
CIUs. In both positions, the L-H% boundary-tone complexes occur in comparable 
proportions. This is displayed in Figure 3.37. 


Figure 3.37: Frequency distribution (in % of occurrences) of tonal events at 
the right-edge boundary of embedded IUs and non-final IUs of CIUs (e. IUs) in 
blue and singleton IUs or final IUs of CIUs in yellow 


Considering the similarities between tonal events demarcating the right-edge 
boundaries of CIUs and non-final IUs of CIUs, the question arises as to how one 
can explain the differences in distribution. 

In section §3.2.1.5 I described the different patterns as expressing varying de- 
grees of finality or continuation. The LH-H% pattern expresses a high degree 
of non-finality or continuation, the L-H% pattern expresses a medium degree of 
continuation or non-finality and the LH-L% pattern expresses finality. Taking 
this as the main function of the patterns explains why the LH-H% pattern is the 
most frequent pattern for embedded, non-final IUs of CIUs. In a chunk of speech, 
here the CIU, speakers phrase various grammatical units into separate prosodic 
units, for which the non-finality-signaling pattern, the LH-H%, is used to signal 
full integration. CIU internally, the LH-L% pattern is infrequent and only used in 
non-canonical constructions such as cleft constructions to signal less integration. 
The opposite holds true for singleton IUs and final IUs of CIUs. Here, the non- 
finality-signaling pattern LH-H% is used very infrequently, and in fact almost 
exclusively in lists, to signal non-finality. The finality-signaling pattern LH-L%, 
however, is very frequent. In both instances, that is CIU internally or CIU final, 
the L-H% pattern, signaling medium finality/non-finality is equally frequent. 

The intonational model of the CIU in Totoli is based mainly on the inspection 
of tonal events. I showed that, internally in CIUs as well as in final position of 
CIUs, the same tonal patterns occur. This led me to postulate a recursively em- 
bedded structure of a Compound IU, rather than a hierarchical structure where 
higher-level units (i.e., the IU) consist of lower-level units (i.e., Accentual Phrases 
or intermediate phrases) and where higher-level units are tonally specified dif- 
ferently than lower-level prosodic units (see e.g., Jun & Fougeron 2000 for such 
an analysis of French and Ipek & Jun 2013 for Turkish). 

Himmelmann’s (2018) model of an IP in languages of Western Austronesia 
describes IU-internal tonal events as boundary-marking devices of smaller units, 
called the intermediate phrase (ip). His model proposes that the right-edge bound- 
ary of an IU consists of a phrase tone and a boundary tone and that ip-final 
boundary-tone complexes consist of a high target only. The model is reprinted 
in Figure 3.38. 


  L$ H$      L$ H$      T-T% 
[[σσσσσσ]ip [σσσσσσ]ip σσσσ]IP 


Figure 3.38: Himmelmann’s (2018: 360) model of the IP in Western Austronesian 
languages: T$ representing an ip boundary tone, T% an IP boundary tone, and 
T- an IP phrase accent. 


Before discussing the theoretical implications of such an analysis, it is worth 
examining two further illustrative examples of Tail-Head Linkage (THL) con- 
structions (see §3.2.1.5 for an explanation of THLs). These make an interesting 
case in point. The Tail IU is repeated in part or in full and serves as the Head 
IU of the subsequent paragraph. The Tail always bears the IU-final 
boundary-tone complex (L)H-L% and the Head always has the L-H% boundary-tone 
complex. Two examples are given below in (22) and (23). Consider first (22) and 
its periogram in Figure 3.39. The first IU is repeated after a long pause and 
the adverb indan ‘after’ is added, pointing to the adverbial status the Head of 
a THL construction holds for the subsequent paragraph. 


Figure 3.39: Periogram with pitch track (in st) for example (22), speaker SUD 


(22) a. <na> nodulu isia 
        no-dulu     isia 
        AV.RLS-help 3S 
        ‘helping him’ 
     b. ah indan nodulu isia 
        ah   indan no-dulu     isia 
        INTJ after AV.RLS-help 3S 
        ‘after helping him’ (pearstory_38_SUD.056) > 


The final boundary-tone complexes and the subsequent pauses mean that the 
status of examples (22a) and (22b) as separate CIUs is unambiguous. However, in 
some instances, the Head is directly followed by further syntactic material, as in 
the THL in example (23), for which the periogram is given in Figure 3.40. 


(23) a. manana umbasan deden 
        manana umbasan   deden 
        child  young.man small 
        ‘(there was) a young boy’ 
     b. ma <um> umbasan deden | mai nagala anu ia 
        manana umbasan   deden mai  nog-ala      anu  ia 
        child  young.man small come AV.RLS-fetch FILL PRX 
        ‘(there was) the boy; he comes to take the thingy’ 
        (pearstory 23 AT.039-40) > 


Figure 3.40: Periogram with pitch track (in st) for example (23), speaker AT 


As expected, the Tail manana umbasan deden ‘young boy’ bears the final 
boundary-tone complex LH-L%, as it is the standard pattern for Tails of THLs. 
The difference pertains to the Heads of the two THLs in example (22b) and 
example (23b). In (23b), the Head, here in near exact repetition, is followed 
by further syntactic material, without any pause, pitch reset, or other 
boundary phenomena. In this case, the Head would have to be analyzed as the 
first ip of the IP, if one applies Himmelmann’s (2018) model. 


         IP                        IP 
         |                         | 
         ip                        ip 
         |                         | 
[[<na> nodulu isia]]    [[ah indan nodulu isia]] 


Figure 3.41: Prosodic organization of examples (22a)-(22b), according 
to Himmelmann’s (2018) model 


           IP 
           | 
           ip 
           | 
[[manana umbasan deden]] 


                      IP 
              /                \ 
             ip                 ip 
             |                  | 
[[ma <um> umbasan deden] [mai nagala anu ia]] 


Figure 3.42: Prosodic organization of examples (23a)-(23b), according 
to Himmelmann’s (2018) model 


However, the tonal events at the right edge of the Heads (22b) and (23b) are the 
same in both of the THL constructions. In example (22), we would label it L-H% 
as it occurs in IP-final position. In the second THL construction in example (23), 
we would have to label the final tonal pattern L-H$, as it occurs at the edge of 
an ip, integrated into an IP ($ indicates an ip boundary tone in Himmelmann’s 
2018 model). However, the model assumes that (non-final) ips and IPs are tonally 
differentiated. We could do away with this seeming contradiction by assuming 
no further intonational level between the phonological word and the IU/IP, and 
by describing IUs such as those in example (23b) as recursively parsed into IUs: 


          IU                              CIU 
          |                      /                  \ 
          |                     IU                   IU 
          |                     |                    | 
[manana umbasan deden] [ma <um> umbasan deden] [mai nagala anu ia] 

Figure 3.43: Alternative prosodic organization of example (23) 


One reason for opposing the alternative analysis in Figure 3.43 is the Strict 
Layer Hypothesis (SLH; Selkirk 1986, Nespor & Vogel 1983, 1986, Vogel 2019), 
which predicts that any prosodic structure consists exhaustively of units of the 
next level down in the prosodic hierarchy, and allows no recursivity. 
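
The prediction of the SLH can be made concrete with a small check over prosodic trees. The sketch below is only an illustration under stated assumptions: the `Node` class, the function `slh_violations`, and the level inventories passed to it are hypothetical, and the word-level bracketing of the trees is simplified. A parent-child pair counts as a violation whenever the child does not belong to the level immediately below its parent, which covers both level skipping and recursion.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A prosodic constituent with a category label and its daughters."""
    label: str
    children: list = field(default_factory=list)


def words(n):
    """Build n terminal nodes of category 'word' (illustrative helper)."""
    return [Node("word") for _ in range(n)]


def slh_violations(node, levels):
    """Collect (parent, child) label pairs that break strict layering.

    `levels` lists the assumed prosodic categories from highest to lowest.
    """
    expected = levels.index(node.label) + 1
    violations = []
    for child in node.children:
        if levels.index(child.label) != expected:
            violations.append((node.label, child.label))
        violations.extend(slh_violations(child, levels))
    return violations


# Strictly layered analysis: one IP made up of two ips (cf. Figure 3.42).
layered = Node("IP", [Node("ip", words(3)), Node("ip", words(4))])

# Recursive analysis: an IU whose daughters are IUs (cf. Figure 3.43).
recursive = Node("IU", [Node("IU", words(3)), Node("IU", words(4))])

layered_result = slh_violations(layered, ("IP", "ip", "word"))
recursive_result = slh_violations(recursive, ("IU", "word"))
```

Under these assumptions the layered tree yields no violations, whereas the recursive tree yields one violation for each embedded IU.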

Though widely applied, the SLH causes empirical problems (see the discussion 
in Ladd 2008: chapter 8.2). With evidence from tone sandhi in Xiamen Chinese, 
Chen (1987) shows that tone groups and IUs regularly intersect and hence violate 
the SLH in that a tone group may be associated with two IUs. On the issue of 
overlapping domains in Luganda, Hyman et al. comment: 
The alternative in Luganda which we consider in work in progress is that 
the SLH and some of the claims of its advocates must be significantly weak- 
ened to allow cyclic assignments of postlexical domains. (1987: 107) 


The model proposed by Himmelmann (2018: 369) analyzes IU-internal tonal 
events as boundary tones of ips and leaves open the status of IU-final material 
that follows the last ip-final boundary, see Figure 3.38. 

One possibility is to analyze this as constituting an ip as well. Himmelmann ar- 
gues that the Strict Layer Hypothesis (SLH; Selkirk 1986) would demand such an 
analysis, but points out that tonal targets are too different (his model assumes 
simple boundary tones at the right-hand edge of ips and boundary-tone 
complexes at the end of IUs) and that one would have to assume that the tunes 
of IU-final ips are deleted or overwritten by the IU-final boundary tones. 
Himmelmann (2018) notes that IU-final boundary-tone complexes are of a different type 
and do not include ip-level tones, which in itself is again a violation of the SLH. 
However, I showed that final boundary tones are very similar, if not identical; 
hence, no overwriting rule would have to be postulated. 

Abolishing the notion of an intermediate phrase level altogether and assuming 
recursive parsing of IUs into CIUs, we can avoid the difficulties in explaining that 
tunes are essentially the same except for the presence or absence of IU boundary 
phenomena. In THL constructions, one would avoid having to use different labels 
for essentially the same tonal pattern. 

Also evident from the examples above are the obvious differences to the Ac- 
centual Phrase (AP), the postulated prosodic unit below the IU in many of the 
prosodic descriptions in the two volumes edited by Jun (2005c, 2014a). Not only 
does the AP differ from the IU in its tonal marking but the “AP has been typically 
defined as a tonally marked prosodic unit which contains one word” (Jun 2014b: 
532). Example (15) in Figure 3.29 above is a particularly instructive instance of 
an adverbial phrase realized as an embedded IU that spans 5 words / 15 syllables 
and has a near-flat level contour except for the right-edge boundary-tone 
complex (kaasikan mogiigitai manana dolago itu ‘because of his excitement in 
looking at the girl ...’). The example adverbial phrase is uttered as one 
prosodic phrase and is clearly larger than the AP and the prosodic word. Therefore, we cannot do 
away with recursive embedding of IUs by assuming simply the level of the non- 
recursively embedded IU and the phonological word or the AP or both. 

The prosodic organization in Figure 3.43 equates to the model proposed by 
Ladd (2008: 297), which he calls the Compound Prosodic Domain (CPD): “A CPD 
is a prosodic domain of a given type X whose immediate constituents are 
themselves of type X.” 


      X 
     / \ 
    X   X 


Figure 3.44: Exemplification of Compound Prosodic Domains, reproduced 
from Ladd (2008: 297) 


In a more recent account by Selkirk (2011) termed Match Theory, recursivity is 
permitted and attributed to syntactic constituency-respecting Match Constraints. 

Evidence obtained from an inspection of the IUs as they occur in the corpus of 
Totoli leaves no doubt that tonal events at the edges of prosodic units are 
essentially the same. Therefore, I argue that complex IUs are best described 
as CIUs that consist of a string of embedded IUs. 

I started by assuming that speech is chunked into prosodic units, which are 
demarcated by a set of boundary phenomena and which are perceived as such by 
listeners (§1.4 and §3.1). I then described and categorized the tonal events at the 
end of such units (§3.2.1). In a further step, I looked at tonal events within such 
units and found that they are essentially the same as those that occur at the end 
(§3.2.2). I concluded that they are also right-edge boundary tones of prosodic 
units which regularly match syntactic units. Based on the observation that tonal 
events of all kinds of prosodic units are essentially the same, I argue for assuming 
recursive embedding of IUs into Compound IUs. The results here show that tonal 
contours are engaged at the level of the IU but not the CIU. It is crucial to 
point out that the argument for recursion is based solely on the tonal 
realization of prosodic units. 

Further evidence for recursion comes from syntax, as briefly shown above. In 
section §3.3, I discuss this aspect more thoroughly by comparing the syntactic 
content of singleton IUs with that of embedded, non-final IUs of CIUs. 


3.3 Intonation Units and grammatical units in Totoli 


In the previous section (§3.2), I presented an in-depth analysis of the tonal patterns 
of prosodic units in Totoli. In this section, I will investigate the syntactic content 
of prosodic units in the Totoli corpus (see §3.1.1) and analyze the grammatical 
units they typically contain. Specifically, I will first investigate which structures 
are usually found in CIUs, whether they are singleton IUs or Compound IUs. Sec- 
ondly, I will investigate the syntactic structures embedded in CIUs and compare 
them to those found in singleton IUs. 
In the present work, I adopt a discourse-oriented approach based on a corpus 
of natural, (semi-)spontaneous, and unscripted speech. Working with such data 
highlights the flexibility in the syntactic content of prosodic units. The question 
arises of the type of syntactic content that can exist within a prosodic unit and 
whether there are any regularities in the relationship between them. 

A confounding factor pertains to the concept of CIUs, which I have introduced 
in this study, referring to either singleton IUs or CIUs, as distinct from embedded 
IUs of CIUs. In my analysis, the term CIU denotes those prosodic units that are 
demarcated by typical boundary cues such as pitch reset and/or pause, as well as 
other criteria mainly related to pitch, rhythm, and voice quality, as mentioned in 
§3.1 above. Therefore, CIU is equivalent to IU as reported in the literature below, 
as compound intonational units are not posited for these languages. 

According to Ladd (2008: 288), explicit phonetic definitions are necessary for 
determining the criteria of IU and prosodic domain types in general. One of the 
confounding factors he identifies in the segmentation of spontaneous speech is 
the presumption that the division of syntactic units into prosodic ones reflects 
syntactic criteria, with many assuming that: 


[...] the various prosodic domains are defined by descriptions of how syn- 
tactic structure is mapped onto prosodic structure. (Ladd 2008: 289) 


One of the significant achievements in prosodic phonology was the realization 
that prosodic boundaries systematically differ from syntactic boundaries. This 
was famously discussed by Chomsky & Halle (1968), who provided the frequently 
cited example of right-branching relative clauses. The syntactic boundaries are 
reprinted in (24a) and the prosodic boundaries in (24b). 


(24) a. This is [the cat that caught [the rat that stole [the cheese]]] 
b. this is the cat - that caught the rat - that stole the cheese 
(Chomsky & Halle 1968: 372) 


Such systematic misalignment of syntactic and prosodic boundaries is usu- 
ally interpreted as the result of mapping a complex syntactic structure onto an 
“intuitively ‘flatter’ or ‘shallower’ prosodic structure” (Ladd 2008: 290). Within 
Prosodic Phonology (Nespor & Vogel 1983, Selkirk 1986, Nespor & Vogel 1986), 
mapping constraints which describe the relation between syntactic and prosodic 
units were formalized. As Féry (2016: 62-63) puts it: 


The basic idea of all models accounting for the syntax-prosody mapping is 
that the syntactic component is submitted to an algorithm - a set of rules 
or constraints - the aim of which is to map a prosodic structure to it. Theo- 
retical issues relate to the way this correspondence is formulated as well as 
to the resulting prosodic constituency. 


In recent years, new alignment constraints have been proposed, including 
Wrap Theory (Truckenbrodt 1999) and Match Theory (Selkirk 2011). The underly- 
ing assumption of such theories is that syntactic constituents correspond to pro- 
sodic units. In Match Theory, this assumption is expressed by the Match Clause, 
Match Phrase and Match Word constraints (Selkirk 2011: 5): 


i. Match Clause: A clause in syntactic constituent structure must be matched by 
a corresponding prosodic constituent, call it ι [intonational phrase], in 
phonological representation. 


ii. Match Phrase: A phrase in syntactic constituent structure must be matched 
by a corresponding prosodic constituent, call it φ [phonological phrase], 
in phonological representation. 


iii. Match Word: A word in syntactic constituent structure must be matched by 
a corresponding prosodic constituent, call it ω [phonological word], in 
phonological representation. 


Ladd (2008: 289) comments on these accounts: 


In my view, it makes no sense to treat accounts like Nespor and Vogel’s 
or Selkirk’s as definitions; rather, they are hypotheses, predictions about 
the correspondence between one type of independently definable structure 
and another. [...] Unless the syntactic and the phonological structures are 
defined in their own terms, the whole exercise becomes purely circular. 


Focusing on natural data, works by Iwasaki & Tao (1993), Schuetze-Coburn 
(1994), Croft (1995), Tao (1996), Iwasaki (1996) and more recently Croft (2007), 
Park (2002), Matsumoto (2003) and Wouk (2008) have provided detailed descrip- 
tions of the syntactic content of IUs as they are found in corpora of spontaneous 
speech from a variety of typologically unrelated languages. These accounts have 
shown the flexibility of the syntactic content of IUs but have also revealed some 
regularities. 

One tendency found in these studies is that approximately 50% of all IUs in a 
corpus consist of a simple clause, e.g. 47.8% in English (Croft 1995: 849), 50.5% in 
Wardaman (Croft 2007: 12), 47.9% in Mandarin Chinese (Tao 1996: 72), and 51.7% 
in Sasak (Wouk 2008: 150). 

Moreover, there seems to be a considerable number of IUs that consist of a 
single NP (referred to as ‘lone NP’ by Croft 2007: 12); for instance, 13.7% in English 
(Croft 1995: 849), 21.1% in Wardaman (Croft 2007: 11), 25.9% in Mandarin Chinese 
(Tao 1996: 72), and 21.0% in Sasak (Wouk 2008: 150). 

However, genre appears to have a substantial influence on the proportions 
of IU types. This must be taken into account when comparing the results from 
different studies, as they vary with regard to the types of data used. 

Tao (1996), Wouk (2008), Matsumoto (2000), Schuetze-Coburn (1994) and Park 
(2002) use conversations between two or more participants, whereas Croft (1995, 
2007) bases his analysis on monological data. The cross-linguistic comparison 
in Croft (2007: 12) conflates both data types. Another reason why these results 
should be approached with caution is that different coding conventions have 
been applied. In two different studies on Japanese, the difference in coding con- 
ventions leads to an 11.6 percentage point difference in the proportion of clausal 
IUs in Japanese (57.0% in Matsumoto 2000: 58 and 45.4% in Iwasaki & Tao 1993: 3). 
These factors have to be taken into consideration when comparing the results on 
the reported data (see also the comments in Park 2002: 642, Croft 2007: 12, and 
Wouk 2008: 139-144). 

Despite the differences, these studies have provided cross-linguistic evidence 
which confirms the centrality of the (simple) clause and the lone NP with regard 
to grammatical structures typically found in IUs. The sentence, on the other hand, 
appears to be rather difficult to identify in spoken speech: 


This is not a problem found with most other grammatical units, such as 
the clause or the phrase, which are generally clearly identifiable in spoken 
language. Yet the sentence is generally taken to be the basic unit of syntac- 
tic analysis. On the other hand, a sentence cannot be equated with an IU, 
the spoken-language analyst’s most popular unit of choice for analysis. An 
IU does not grammatically correspond to a sentence, since it frequently is 
a unit smaller than a sentence and sometimes (though quite rarely) is not 
a full grammatical constituent at all. (Croft 1995: 841) 


As described above, the majority of IUs contain a clause or a phrase. Other 
types include IUs consisting of a single connective or an interjection. Croft (2007: 
11) specifically argues that these should similarly be thought of as constituting 
independent grammatical units. He calls these “lexical IUs”. Another central ob- 
servation is that the number of broken or interrupted IUs, such as (uncorrected) 
false starts, disjointed IUs and fragmentary IUs, is very small (2% in the corpus 
on Wardaman, see Croft 2007: 11). 

In sum, speakers produce short stretches of speech which are rarely broken 
or fragmented and which mostly contain a full grammatical unit. Based on these 
observations, Croft (1995: 845) proposes the full Grammatical Units condition: 


The overwhelming preference for IUs to be in the form of full GUs [gram- 
matical units], other things being equal, will be called the full GU condition. 


In the same article, Croft (1995: 872) offers a possible explanation to account 
for both (a) the small number of broken IUs and (b) the high number of rather 
short and syntactically simple IUs. Croft calls it the IU storage hypothesis. 

Intonation Units are explained as cognitive units and are considered “linguistic 
expressions of focuses of consciousness, whose properties apparently belong to 
our built-in information-processing capabilities” (Chafe 1980: 48). As there is no 
inherent constraint on the size of IUs per se, there must be some sort of cognitive 
limitation. Croft’s (1995: 873) IU storage hypothesis suggests that Intonation Units 
consist of grammatical units that are stored or precompiled in the memory of 
the speaker. He argues that this accounts for the overwhelming frequency of IUs 
consisting of a single clause or single NP. More complex structures need to be 
computed based on the precompiled or stored grammatical units, which is why 
complex structures, such as multiply embedded NPs, rarely occur in spontaneous 
speech and are usually broken across several IUs. Croft (1995: 873) explains this 
in terms of the cognitive limitations of humans: 


Stored/precompiled constructions — and IUs themselves - may be the man- 
ifestation of the limitations of short-term memory in processing. The IU 
storage hypothesis suggests that grammatical structure and organization 
have evolved to conform to the limitations as well as the capacities of the 
human mind, specifically those embodied in IU structure. 


Chafe (1994: 108) offers another cognitively grounded explanation to account 
for the types of structures typically found in IUs, the One New Idea Constraint: 


Conversational language appears subject to a constraint that limits an in- 
tonation unit to the expression of no more than one new idea. 


Chafe argues that speakers can only activate one concept at a time and the IU 
is the basic unit used to express this cognitive process. If a simple clause with one 
predicate and one argument is the typical exponent of an IU, then only one of the 
two elements expresses a new idea. Chafe acknowledges that counterexamples 
from spontaneous speech are plentiful and offers a variety of explanations for 
such structures. Himmelmann et al. (2018) offer a way to measure the informa- 
tion content of an IU by computing the average number of content words an IU 
typically contains. Their study on four typologically unrelated languages found 
an average of 1.6-1.8 content words per IU (see also the discussion on various 
ways of measuring the length of IUs in §3.1.3). 
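
A measure of this kind can be sketched as follows. This is not the procedure of Himmelmann et al. (2018); it is a minimal illustration that assumes each IU is available as a list of (word, part-of-speech tag) pairs. The tag set `CONTENT_TAGS`, the function name `mean_content_words`, and the English toy data are all hypothetical.

```python
# Assumption: these part-of-speech tags count as content words.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def mean_content_words(ius):
    """Average number of content words per IU.

    `ius` is a list of IUs; each IU is a list of (word, tag) pairs.
    """
    if not ius:
        raise ValueError("need at least one IU")
    totals = [sum(tag in CONTENT_TAGS for _, tag in iu) for iu in ius]
    return sum(totals) / len(ius)


# Toy data (hypothetical, for illustration only).
toy_ius = [
    [("he", "PRON"), ("crashed", "VERB")],
    [("a", "DET"), ("basket", "NOUN")],
    [("the", "DET"), ("boy", "NOUN"), ("takes", "VERB"), ("it", "PRON")],
    [("well", "INTJ")],
]
mean = mean_content_words(toy_ius)
```

Which word classes count as content words is a coding decision that affects the result, so the tag set has to be stated explicitly when values are compared across studies.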

The One New Idea Constraint is limited to those IUs that Chafe (1994: 63) refers 
to as substantive IUs, that is, those which express ideas of events, states or refer- 
ents. The other major IU type is regulatory IUs, which regulate interaction and 
information flow. A third and minor IU type is fragmentary IUs. 

On that matter, Chafe (1994: 119) comments: 


In any case, the finding that people can activate only one new idea at a 
time, as well as the insight that finding gives us into what it means to 
constitute “one idea,” may be at least as important as the finding that short- 
term memory is limited to seven items plus or minus two (Miller 1994). 
The magical number one appears to be fundamental to the way the mind 
handles the flow of information through consciousness and language. 


Yet, cognitive limitations are not the only constraints at work. Research by 
Park (2002: 674) has shown that the IU is a “resource that participants in an in- 
teraction may use and manipulate to achieve their interactional goals.” He shows 
that substantive IUs, too, are subject not only to cognitive constraints but also to 
interactional constraints. In what follows, I contrast different aspects of the IU 
as they occur in conversation and in monological recordings. I show that genre 
has a substantial effect on the size of IUs, and detail the proportion of (Chafeian) 
IU types and the proportion of IUs with regard to their syntactic content, among 
other results. 

In this section, I investigate the grammatical structures that prosodic units 
typically contain. It is organized into four parts: In the first part, §3.3.1, I explore 
the grammatical structures found in CIUs - either singleton IUs or CIUs. In the 
second section, §3.3.2, I explore the grammatical structures found in embedded 
IUs of CIUs. The following section, §3.3.3, aims to compare the two from a syn- 
tactic point of view. In section §3.3.4, I review the findings from the analysis of 
tonal patterns occurring at the edges of prosodic units and revisit the evidence 
for recursive embedding of IUs in Totoli with the evidence from syntax. 


3.3.1 Syntactic structures of singleton IUs and CIUs 


I briefly discussed above several studies that have addressed the question of what 
grammatical structures are typically found in IUs (Croft 1995, 2007, Tao 1996, 
Park 2002, Schuetze-Coburn et al. 1991, Schuetze-Coburn 1994, Matsumoto 2003, 
Iwasaki 1996, Iwasaki & Tao 1993, Wouk 2008). The studies vary substantially 
with regard to the data they are based on. Yet, it is to be expected that genre has 
a substantial influence on the proportions of IU types. The influence of genre has 
been anticipated by Du Bois (1987: 836): 


It is worth emphasizing that, while conversation may well be the more fre- 
quent genre, narrative is especially likely to display conditions of relatively 
high information pressure (...) The heavy information pressure demands 
in narrative may well give it significance beyond what it otherwise would 
have for the adaptive shaping of grammar in response to discourse needs. 


Despite this obvious fact, Croft (2007: 12) makes cross-linguistic claims about 
the syntactic nature of IUs by comparing the proportions of grammatical struc- 
tures reported in different studies. The present work is based on a corpus of 
conversational and monological recordings, enabling a comparison of the pro- 
portions of different types of CIUs within these two data types. The analysis 
presented here systematically investigates the influence of genre — monological 
versus conversational — and demonstrates its strong impact on the proportions of 
grammatical structures found in CIUs. Comparing other, more subtle subtypes 
of genre is also conceivable and is likely to yield slightly different results con- 
cerning the distribution of the syntactic nature of prosodic units. In this chapter, 
I focus on two broad categories—conversations versus monologues—only, as per 
Du Bois’s 1987 indication that the difference in information pressure is most pro- 
nounced in these categories. 


3.3.1.1 Methodology and coding conventions 


The analysis presented here examines the CIU, which refers to either a singleton 
IU or a Compound IU, as the primary unit of investigation. The study aims to 
determine the typical grammatical structures found within these units. As a re- 
sult, the CIU is the sole domain for coding. It is a prosodic unit that consists of 
either a singleton IU or a CIU made up of a sequence of IUs, which is delimited 
by boundary marking cues such as a pause, a break in pitch contour, and a pitch 
reset. 

This is illustrated in the three CIUs presented in examples (25a-25c). The 
second singleton IU in (25b) contains the noun sellenget ‘one basket’, which can be 
analyzed either as an argument to the verb in the preceding singleton IU in (25a) 
or as part of the following CIU in (25c). However, because it constitutes its own 
singleton IU, (25b) is analyzed as a nominal IU, while both (25a) and (25c) are 
considered clausal CIUs. 

Figure 3.45: Periogram with pitch track (in st) for example (25), speaker SP 


(25) a. ilantumnamo 
        ni-lantum-0=na=mo 
        RLS-bring.along-UV=3S.GEN=CPL 
        ‘he brought (it)’ 
     b. sellenget 
        so-RDP-lenget 
        ONE-RDP-basket 
        ‘a basket’ 
     c. sakena dei sapeda danna <ipoa> 
        sake-0=na          dei sapeda danna 
        get.on-APPL=3S.GEN LOC cycle  then 
        ‘he put (it) on the bicycle and then’ (pearstory 14 SP.019-21) > 


The structural relations between CIUs are only examined in relation to specific 
aspects of their internal distinctions, which will be discussed in the following 
sections. In order to provide adequate context and facilitate understanding for 
the reader, I include the CIUs adjacent to the examples being discussed, although 
they may not always be explicitly elaborated upon. 


3.3.1.2 Discussion of CIU types 


Following Tao (1996), I differentiate four categories: (a) clausal CIUs, (b) nominal 
CIUs, (c) interactional CIUs and (d) other minor types. I will discuss these in the 
following sections. 


3.3.1.2.1 Clausal CIUs 


In this study, a clausal CIU is defined as one that contains at least one predicate. 
Two types of clausal CIUs are distinguished: independent and dependent. More- 
over, independent CIUs are further classified based on the number of overtly 
expressed arguments. The definitions of these categories are detailed in the fol- 
lowing sections. 


Independent clausal CIUs The simplest form of an independent clausal CIU 
comprises a verbal predicate and a single overtly expressed argument, which 
may be either a lexical NP or a pronoun. An example of this is provided in (26), 
which illustrates a basic independent clause containing a preverbal pronominal 
argument isia ‘he’. 
Figure 3.46: Periogram with pitch track (in st) for example (26), speaker IRN


(26) isia nabbabag 
isia no-RDP-babag 
3s ST.RLS-RDP-crash.into 
‘he crashed into (it)’ (pearstory_15_IRN.009) >
In Totoli, the agent argument of the undergoer voice is often realized as a
clitic pronoun on the verb (for an explanation of the voice system, see Riesberg
et al. 2019, Riesberg 2014). In the conventions of this study, such constructions are
categorized as simple clauses with one overtly expressed argument. Consider the
three IUs in (27). The initial singleton IU (27a) comprises a verb with the agent
argument expressed as the enclitic =na ‘3S.GEN’. The second CIU (27b) features
the same verb, but here the undergoer argument is expressed as the lexical NP
sapeo itu ‘this hat’, while the agent argument is unexpressed. The third
singleton IU (27c) again contains a verb with the agent argument realized as an
enclitic on the verb. According to the coding conventions employed in this study,
all three CIUs are classified as simple independent clausal CIUs with one overtly
expressed argument.
Figure 3.47: Periogram with pitch track (in st) for example (27), speaker SNG


(27) a. niuntudnamoko
ni-untud-0=na=mo=ko
RLS-bring-UV=3S.GEN=CPL=AND
‘he brought (it)’

b. niuntudmoko sapeo itu
ni-untud-0=mo=ko sapeo itu
RLS-bring-UV=CPL=AND hat DIST
‘(he) brought this hat’

c. nibeennamai
ni-been-0=na=mo=ai
RLS-give-UV=3S.GEN=CPL=VEN
‘he gave (it)’ (pearstory_17_Sng.101-103) >

Oblique and core arguments are not distinguished in the analysis. Example
(28) illustrates a clause in which the negated verb noliitaan ‘meet’ is followed by
an oblique argument introduced by the preposition takin ‘with’. Although its only
overt argument is oblique, this CIU is still coded as a simple independent clausal
CIU with one overtly expressed argument.
Figure 3.48: Periogram with pitch track (in st) for example (28), speaker RDA


(28) nga noliitaan takin tau dako
inga noli-ita-an takin tau dako
NEG RCP.RLS-see-RCP.RLS with person big
‘(I) haven't met my parents’ (lifestory_RDA_1.024) >


Equational predications are analyzed as simple clauses with one nominal
predicate and one argument. In example (29), the CIU consists of an equational
clause with two elements, siritaku ia ‘my story’ and sirita tau pomo ‘the story of a
person from the old times’. Note that Totoli does not use a copula. The
CIU in example (29) is also considered a simple independent clausal CIU with
one overt argument.
Figure 3.49: Periogram with pitch track (in st) for example (29), speaker RDA


(29) sirita aku ia sirita tau pomo
sirita aku ia sirita tau pomoo
story 1S.GEN PRX story person first
‘my story is the story of a person from the old times’
(lifestory_RDA_1.014) >


Totoli has two existential constructions. One involves a form of the existential
predicate daan/kaddaan/dadaan ‘exist’. The other involves the existential
prefix ko=.

Examples of the existential prefix are given in the singleton IUs in (30a) and
(30b). The bases badu ‘shirt’ and sampan ‘pants’ occur with ko=, in this case
together with the negator nga. Each constitutes a full clause.


(30) a. nga kabadu
inga ko=badu
NEG EXIST=shirt
‘there were no shirts’

b. nga kasampan
inga ko=sampan
NEG EXIST=pants
‘there were no pants’ (lifestory_RDA_1.032-034) >


An example of a singleton IU containing a construction with an existential 
predicate is given in (31a). 
Figure 3.50: Periogram with pitch track (in st) for example (30), speaker RDA

Figure 3.51: Periogram with pitch track (in st) for example (31), speaker FAH

(31) a. daan taisal
daan taisol 
EXIST old.man 
‘there is an old man’ 


b. laalau monipu piir 
RDP-lau moN-tipu piir 
RDP-presently AV-pick pear
‘(he) currently picks pears’ (pearstory_9_FAH.002-4) >


The constructions presented in (30a), (30b) and (31a) are full clauses. The three 
singleton IUs are considered simple clausal CIUs that consist of one (existential) 
predicate and one overtly expressed argument. 

Clausal CIUs which contain more than one predicate and/or more than two
overtly expressed arguments are referred to here as complex clausal CIUs. This
includes CIUs containing a simple clause and a subordinate clause, e.g. a simple
clause with a modifying relative clause or with an adverbial clause. Other cases
are two coordinated clauses or a main clause with a complement clause parsed
into one CIU. It is important to note that in Totoli, as well as in the local
(Manado) Malay variety, a negated existential predicate is often used to negate
entire clauses. In the counts used in this study, such constructions appear as
complex clausal CIUs since they involve two predicates. Such a construction is
given in example (32). The clause parhatikanna tau ipanau ia ‘he notices the people
below’ is negated with the negated existential predicate inga daan.
Figure 3.52: Periogram with pitch track (in st) for example (32), speaker RSTM

(32) ha inga daan parhatikanna tau ipanau ia
ha inga daan parhatikan-0=na tau i-panau ia
INTJ NEG EXIST pay.attention-UV=3S.GEN person LOC-under PRX
‘He didn’t notice the people below there’ (pearstory_12_RSTM.090-92) >


An instance of two coordinate clauses parsed into one CIU is given in example 
(33). Note that no coordinating conjunction occurs. 
Figure 3.53: Periogram with pitch track (in st) for example (33), speaker SYNO

(33) giigii mellegesan giigii meggegesan
RDP-gii mo-lelegesan RDP-gii mo-RDP-geges-an
RDP-different AV-Lelegesan RDP-different AV-RDP-rub-APPL
‘Singing Lelegesan is different from rubbing (your body).’
(lit. ‘Singing Lelegesan is different (and) rubbing is different.’)
(explanation-lelegesan_SYNO.032) >


Example (34b) is another instance of a complex clausal CIU which involves an 
adverbial clause and its matrix clause parsed in a single CIU. 


(34) a. nadabu sapeona
no-dabu sapeo=na
ST.RLS-fall hat=3S.GEN
‘his hat fell’

b. karena isia nagitai sapeona itu geiga noitana batu dei dulak
karena isia nog-ita-i sapeo=na itu geiga
because 3S AV.RLS-see-APPL hat=3S.GEN DIST NEG
no-ita-0=na batu dei dulak
POT-see-UV=3S.GEN stone LOC front
‘Because he looks at the hat, he doesn't see the stone in front.’
(pearstory_9_FAH.026-27) >

Figure 3.54: Periogram with pitch track (in st) for example (34), speaker FAH


Dependent clausal CIUs These include various adverbial clauses that occur
in separate CIUs. An example is provided in (35a), where the initial element is
the subordinating conjunction indan ‘after’, which unambiguously indicates the
dependent status of the clause. Unlike in (34b) above, the adverbial clause and
its matrix clause are phrased in two separate CIUs.


(35) a. indan nopulinmo doua llenget itumoko
indan no-pulin=mo doua RDP-lenget itu=mo=ko
after ST.RLS-full=CPL two RDP-basket DIST=CPL=AND
‘after the two baskets were full’

b. notumalibmoko tau gogoot toalan itu
no-t<um>alib=mo=ko tau RDP-goot toalan itu
AV.RLS-<AUTO.MOT>pass.by=CPL=AND person RDP-hold goat DIST
‘a person passed by holding a goat’ (pearstory_12_RSTM.064-65) >

Figure 3.55: Periogram with pitch track (in st) for example (35), speaker RSTM


In several instances, the subordinate status of a clause is not indicated by a
subordinating conjunction. Nonetheless, intonation and context clearly
indicate its status, as discussed further in §3.3.2. An example is provided in the
IU in (36a), which contains an adverbial clause with either a temporal or, more
likely, a causal reading. No subordinating conjunction specifies one
interpretation over the other.
Figure 3.56: Periogram with pitch track (in st) for example (36), speaker SP

(36) a. moniiniligko dei dolago terus itu
moN-RDP-silig=ko dei dolago terus itu
AV-RDP-glance=AND LOC girl then DIST
‘he looked at the girl constantly’ 


b. sapeda nollumpakmoko dei batu 
sapeda no-RDP-lumpak=mo=ko dei batu 
bicycle ST.RLS-RDP-hit.against=CPL=AND LOC stone 
‘the bicycle crashes against the stone’ (pearstory_14_SP.027-28) > 
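
The coding conventions for clausal CIUs described in this section can be summarized as a small decision procedure. The following Python sketch is only an illustration of that logic: the class `ClausalCIU`, its fields and the function `code_clausal_ciu` are hypothetical names and do not reproduce the annotation workflow actually used for the corpus. The precedence of the label "dependent" over "complex" is likewise an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ClausalCIU:
    # Number of predicates (verbal, nominal or existential) in the CIU.
    predicates: int
    # Overtly expressed arguments; core and oblique arguments are not
    # distinguished, and clitic pronouns on the verb count as overt.
    overt_arguments: int
    # True for an adverbial clause phrased in its own CIU.
    dependent: bool = False


def code_clausal_ciu(ciu: ClausalCIU) -> str:
    """Return the subcategory label for a clausal CIU (illustrative only)."""
    if ciu.predicates < 1:
        raise ValueError("a clausal CIU contains at least one predicate")
    if ciu.dependent:  # assumption: checked before complexity
        return "dependent"
    if ciu.predicates > 1 or ciu.overt_arguments > 2:
        return "complex"
    return str(ciu.overt_arguments)  # simple clause: "0", "1" or "2"


# (26): one verb and one pronominal argument
print(code_clausal_ciu(ClausalCIU(predicates=1, overt_arguments=1)))  # 1
# (32): negated existential predicate plus the predicate of the negated clause
print(code_clausal_ciu(ClausalCIU(predicates=2, overt_arguments=2)))  # complex
# (35a): adverbial clause in a separate CIU
print(code_clausal_ciu(ClausalCIU(1, 1, dependent=True)))             # dependent
```

The sketch makes explicit why a simple clause negated with an existential predicate ends up in the "complex" count: the second predicate alone is sufficient to trigger that label.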


3.3.1.2.2 Nominal CIUs 


Nominal CIUs are composed of either a single NP or a relative clause. The latter 
is included here because it is equally referential, hence the label “nominal CIUs” 
rather than “NP-CIUs” (cf. Croft 2007: 13 and Tao 1996: 79). 

Croft (2007: 13) presents a basic categorization of nominal CIUs into three
types:


Independent: are those nominal CIUs which have no structural relation with any 
of the adjacent intonation units. 


Parallel: are separate CIUs containing “conjoined or appositive NPs” (Croft 2007: 
13). 


Arguments: are nominal CIUs that have a structural relationship with a neigh- 
boring CIU; i.e. they can be analyzed as an argument to a predicate of an 
adjacent clausal CIU. 


The immediately adjacent CIUs are taken as the domain for category-internal 
classification of nominal IUs. A nominal CIU is considered an argument CIU if 
it can be analyzed as an argument of a clausal CIU immediately preceding or 
following it. Relative clauses with a head noun in the immediately preceding 
CIU are classified as parallel. Most free relative clauses that constitute their own 
CIUs can be analyzed as arguments and are classified as such; otherwise, they 
are classified as independent. Examples of all of these subtypes of nominal CIUs 
are provided below. 
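
The classification rules for nominal CIUs stated above can be sketched as a short decision procedure. The Python code below is a hedged illustration: `NominalCIU`, its boolean fields and `code_nominal_ciu` are hypothetical names introduced here, and the order in which the "parallel" and "argument" tests are applied is an assumption, since the text does not state a precedence between them.

```python
from dataclasses import dataclass


@dataclass
class NominalCIU:
    # The CIU is a relative clause whose head noun is in the
    # immediately preceding CIU.
    relative_clause_with_preceding_head: bool = False
    # The CIU contains a conjoined or appositive NP (Croft 2007: 13).
    conjoined_or_appositive: bool = False
    # The CIU can be analyzed as an argument of a clausal CIU that
    # immediately precedes or follows it.
    argument_of_adjacent_clause: bool = False


def code_nominal_ciu(ciu: NominalCIU) -> str:
    """Return 'parallel', 'argument' or 'independent' (illustrative only)."""
    if ciu.relative_clause_with_preceding_head or ciu.conjoined_or_appositive:
        return "parallel"  # assumption: tested before argument status
    if ciu.argument_of_adjacent_clause:
        return "argument"
    return "independent"


# An NP that is the argument of the predicate in the following CIU
print(code_nominal_ciu(NominalCIU(argument_of_adjacent_clause=True)))
# A relative clause whose head NP is in the preceding CIU
print(code_nominal_ciu(NominalCIU(relative_clause_with_preceding_head=True)))
# A topic-introducing NP without a structural relation to its neighbours
print(code_nominal_ciu(NominalCIU()))
```

Only the immediately adjacent CIUs are consulted, which mirrors the domain chosen for the category-internal classification.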


Nominal argument CIUs Example (37a) is an instance of a nominal argu- 
ment CIU. It serves as an argument to the existential predicate in the subsequent 
CIU in (37b). 
Figure 3.57: Periogram with pitch track (in st) for example (37), speaker ZBR


(37) a. meman sistim kokoluargaan 
meman sistim ko-koluarga-an 
in.fact system NR-family-NR 
‘in fact the family system’ 

b. musti dadaanpo
musti RDP-daan=po
have.to RDP-EXIST=INCPL
‘has to remain’

c. mekelegpo
mo-keleg=po
ST-strong=INCPL
‘stay strong’ (explanation-wedding-tradition_ZBR.249-251) >


A nominal CIU may also contain an NP with a modifier. For instance, in (38b), 
the CIU contains the NP adfokaat ‘avocado’ and the modifying relative clause 
anu nilantumnako ia ‘which he brought’. This CIU is a nominal argument CIU 
as it serves as the argument of the verb in the following clausal, singleton IU in 
(38c). 


(38) a. nollumpak dei batu
no-RDP-lumpak dei batu
ST-RDP-hit.against LOC stone
‘(he) crashed against the stone’

b. adfokaat anu nilantumnako ia
alpukaat anu ni-lantum-0=na=ko ia
avocado REL RLS-bring.along-UV=3S.GEN=AND PRX
‘the avocados which (he) brought’

c. nakakabmoko
no-kakab=mo=ko
ST.RLS-pour=CPL=AND
‘scattered/poured’ (pearstory_11_SP.027-28) >

Figure 3.58: Periogram with pitch track (in st) for example (38), speaker SP


CIUs consisting of a headless relative clause are argument CIUs if they serve 
as an argument to an adjacent CIU. Example (39b) is an instance of a headless 
relative clause phrased as a separate CIU. It serves as the argument to the verb 
in the preceding clausal CIU (39a). 


(39) a. sukati itaita
sukat-i ita-i=ta
try-UV see-APPL=2S.GEN
‘try to look for’

b. anu saasalu dei puun kaju
anu RDP-salu dei puun kaju
REL RDP-facing LOC tree wood
‘the one facing the tree’ (spacegames_sequence2_KSR-SP.198-199) >

Figure 3.59: Periogram with pitch track (in st) for example (39), speaker SP

Figure 3.60: Periogram with pitch track (in st) for example (40), speaker SP


(40) a. bali nnea ia mollindzon sasaakan
bali nenea ia mo-RDP-lindzon sasaakan
so today PRX AV-RDP-gather everybody
‘so today everybody gathers’

b. ssaakan tau montoliusat
sasaakan tau montoli-usat
all person BE.RELATED.AS-related
‘all the relatives’

c. montoliaman
montoli-aman
BE.RELATED.AS-father
‘the relatives of the father’
(explanation-wedding-tradition_ZBR.023-25) >


The singleton IU in (41c) is an example of a relative clause with its head in
the preceding CIU, shown in (41b). It is analyzed here as a nominal CIU of the
parallel type, as it modifies the head NP puun kaju ‘the tree’ in the preceding
CIU.

Figure 3.61: Periogram with pitch track (in st) for example (41), speaker SP


(41) a. bali singaian nibeenannako alpukaat kalanena itu
bali singaian ni-been-an=na=ko alpukaat kalangena itu
so friend RLS-give-APPL=3S.GEN=AND avocado a.moment.ago DIST
‘so the friends who were given the avocado earlier’

b. nallakamoko notumalibmo niko dei alun puun kaju
no-RDP-lako=mo=ko no-t<um>alib=mo
AV.RLS-RDP-walk=CPL=AND AV.RLS-<AUTO.MOT>pass.by=CPL
poni=ko dei alun puun kaju
again=AND LOC under tree wood
‘they walked past, below the tree’

c. anu lau penek tau pagauan ia dei alung alpukaat ia
anu lau penek-0 tau po-gauan ia dei alung alpukaat ia
REL presently climb-UV person GER-garden PRX LOC under avocado PRX
‘which was just climbed up by the farmer under the avocados’
(pearstory_11_SP.043-45) >


Independent nominal CIUs These are nominal CIUs - often singleton IUs — 
which cannot be analyzed as bearing a structural relation with any adjacent CIU. 
They often perform a topic-introducing function (Croft 2007: 13), as in example 
(42b). 
Figure 3.62: Periogram with pitch track (in st) for example (42), speaker FAH

(42) a. hm
‘hm’

b. kedadianna
kedadian=na
event=3S.GEN
‘the situation’


c. daan taisol 
daan taisol 
EXIST old.man 


‘there is an old man’ (pearstory_9_FAH.001-3) > 


3.3.1.2.3 Interactional CIUs 


This category includes CIUs, often singleton IUs, that consist of an interjection
such as eh, mm, io and other discourse markers; see example (43b).
Figure 3.63: Periogram with pitch track (in st) for example (43), speaker RD

(43) a. douamo anu nopool
doua=mo anu no-pool
two=CPL REL ST.RLS-full
‘two are already full’

c. sia nemenek ulan magalai poni manuanan dei sallo
isia noN-penek ulay mog-ala=ai poni moN-suan-an dei sallo
3S AV.RLS-climb repeat AV-fetch=VEN again AV-fill-APPL LOC basket
‘he climbs again, to take again and put it in the basket’
(pearstory_13_RD.025-27) >


The filler element anu is considered an interactional singleton IU only if it 
occurs as a bare root in a separate singleton IU such as in (44b). In the presence 
of verbal morphology, it is coded as clausal, as it is usually smoothly integrated 
into the clause structure. See the two CIUs in examples (45a)—(45b). 
Figure 3.64: Periogram with pitch track (in st) for example (44), speaker RSM

(44) a. tutunmo
tutun=mo
burn=COMPL
‘Burn me!’

b. anu
anu
FILL
‘...thingy...’

c. o, tiana, geiga, kudabuan dei aga
o tiana geiga ku=dabu-an dei ogo
INTJ QUOT NEG 1SG=throw-UV LOC water
‘Oh no, she says, I will throw you in water!’ >
Figure 3.65: Periogram with pitch track (in st) for example (45), speaker SELP

(45) a. a sagaat naanuanmo alpukaat ia
a so-gaat no-anu-an=mo alpukaat ia
a ONE-part AV.RLS-FILL-APPL=CPL avocado PRX
‘half of the avocados thingied’

b. sagaat naanuanmo tau nanako ia
so-gaat no-anu-an=mo tau non-tako ia
ONE-part AV.RLS-FILL-APPL=CPL person AV.RLS-steal PRX
‘half of it was thingied by the thief’ (pearstory_36_SELP.287) >


3.3.1.2.4 Others 


This category includes adverbs and connectives that occur as single CIUs, as 
well as prepositional phrases. Additionally, it encompasses fragmentary CIUs 
and instances of code-switching. 

Frequently, an adverb or connective by itself forms a separate CIU, often a
singleton IU. The most commonly used items in Totoli are bali ‘so’, indan ‘then’,
antuknako ‘that is’, tapi ‘but’, and danna ‘then’. An example is provided in (46a).
Figure 3.66: Periogram with pitch track (in st) for example (46), speaker SP

(46) a. bali
bali
so
‘so’

b. pogitata anu batu
pog-ita-0=ta anu batu
SF-look.for-UV=2S.GEN REL stone
‘look for a stone’

c. kaddaan buubuna
ko-RDP-daan RDP-buna
EXIST-RDP-EXIST RDP-flower
‘(which) has flowers’ >


Prepositional phrases involve a preposition and a nominal element in adverbial
function forming a single CIU, as in example (47b).


(47) a. namo nallakoan baki tuku
namo no-RDP-lako-an baki tuku
only AV.RLS-RDP-walk-APPL head knee
‘only walking on knees’

b. dei dalan babi
dei dalan babi
LOC road pig
‘on a secret path’ (lifestory_RDA_1.072-74) >

Figure 3.67: Periogram with pitch track (in st) for example (47a), speaker RDA


As per Tao (1996: 72) and Wouk (2008: 150), the CIUs grouped under “Other”
include oblique arguments and adverbial adjuncts. Differentiating between these
two types of CIUs in the corpus is relatively straightforward, and only a few
ambiguous cases were encountered. One simple distinguishing characteristic is
optionality. Quirk et al. (1985: 50) argue that while oblique arguments are usually
obligatory, adverbial adjuncts

“may be regarded, from a structural point of view, largely as ‘optional extras’,
which may be added at will, so that it is not possible to give an exact limit
to the number of adverbials a clause may contain”.


Various other elements occur as separate CIUs and cannot be classified under 
any of the categories mentioned earlier. These are also included in the category 
“Other”. One example is negatives, as shown in example (48a). 


(48) a. o inga inga
o inga inga
O NEG NEG
‘oh no no’

b. inga daan kan makadaang maggalima ia inga
inga daan kan moko-doong mo-RDP-gali=mo ia inga
NEG EXIST perhaps ST.AV-want AV-RDP-stop=CPL PRX NEG
‘there is no longing to stop this, no’
(explanation-wedding-tradition_ZBR.341-342) >

Figure 3.68: Periogram with pitch track (in st) for example (48), speaker RDA


Other units are numerals, shown in (49c), and quotative elements, such as in
(49b).

Figure 3.69: Periogram with pitch track (in st) for example (49), speaker RDA


(49) a. isia kodoon modumakit
isia ko=doon mo-d<um>akit
3S EXIST=want AV-<AUTO.MOT>across
‘he wants to cross’

b. tiana
tiana
QUOT
‘he says’

c. sabatu
sabatu
sabatu
‘one’ (story-monkey-crocodile_RSM.030-32) >


3.3.1.3 Distribution and discussion 


As an initial step, I examine the distribution of the four CIU types: (a) clausal 
CIUs, (b) nominal CIUs, (c) interactional CIUs, and (d) other minor CIU types. 
Figure 3.70 displays the frequency distribution of these CIU types in the corpus, 
showing both the overall distribution and the distribution within the conversa- 
tional and monological data. 
Figure 3.70: Frequency distributions of the four broad categories of
CIUs within conversational and monological recordings


In the entire corpus, clausal CIUs account for 52.7%, nominal CIUs for 15.9%,
and interactional CIUs for 13.7%. Other structures make up 17.6% of the corpus.
The difference between conversational and monological data primarily concerns
clausal and interactional CIUs. In conversational data, the proportion of clausal
CIUs is 11.7 percentage points lower (45.7% vs. 57.4%), while the proportion of
interactional CIUs is 11.4 percentage points higher (20.6% vs. 9.2%).

In Totoli, the clause is a major type of construction that constitutes a CIU. To
further examine this type, I present the distribution of various types of clausal
CIUs in Figure 3.71. Dependent clausal CIUs (cf. §3.3.1.2.1) are displayed in the
right-hand columns. Independent clausal CIUs (cf. §3.3.1.2.1) are further
subdivided into simple clausal CIUs (with zero, one, or two overtly expressed
arguments), and complex clausal CIUs (involving more than one predicate and/or
more than two arguments).


[Bar chart with two panels: conversational (n = 521) and monological (n = 987);
x-axis categories: 0, 1, 2, complex, dependent; y-axis: distribution of
occurrences in %]

Figure 3.71: Frequency distributions of subcategories of clausal CIUs
within conversational and monological recordings


In both conversational and monological data, there is a notable proportion of
clausal CIUs with one overtly expressed argument (47.8% in conversational data,
41.6% in monological data), as well as a high proportion of elliptical CIUs with
no overtly expressed argument (33.8% in conversational data, 20.2% in
monological data). It is also noteworthy that there are no CIUs containing a
dependent clause in conversational data.

Figure 3.72 offers a detailed breakdown of the various types of nominal CIUs.
This includes nominal argument CIUs (cf. §3.3.1.2.2), parallel nominal CIUs (cf.
§3.3.1.2.2), and independent nominal CIUs (cf. §3.3.1.2.2).


[Bar chart with three panels: all (n = 456), conversational (n = 155) and
monological (n = 301); x-axis categories: Argument, Independent, Parallel;
y-axis: distribution of occurrences in %]

Figure 3.72: Distributions of argument, independent and parallel nominal
CIUs within conversational and monological recordings; numbers
are rounded to one decimal place.
The data show a high number of independent nominal CIUs and substantial
differences between conversational and monological data: 61% independent
nominal CIUs in conversations and 35.5% in monological data.

Table 3.5 summarizes the results.

Table 3.5: Summary of distributions of different CIU types and subcategories:
total proportions are given in the left-hand columns, total numbers in the
right-hand columns.

                    all             conversational    monological
clausal             52.7%   1508    45.7%    521      57.4%    987
  0                 13.1%    375    15.4%    176      11.6%    199
  1                 23.1%    660    21.8%    249      23.9%    411
  2                  5.4%    154     3.2%     37       6.8%    117
  complex            8.7%    248     5.2%     59      11.0%    189
  dependent          2.5%     71     0.0%      0       4.1%     71
nominal             15.9%    456    13.6%    155      17.5%    301
  argument           6.0%    172     3.4%     39       7.7%    133
  independent        7.1%    202     8.3%     95       6.2%    107
  parallel           2.9%     82     1.8%     21       3.5%     61
interactional       13.7%    393    20.6%    235       9.2%    158
others              17.6%    504    20.2%    230      15.9%    274
  adv. & con.        3.4%     96     2.4%     27       4.0%     69
  code-switching     1.8%     52     1.5%     17       2.0%     35
  fragments          2.1%     61     3.6%     41       1.2%     20
  further            3.8%    108     6.4%     73       2.0%     35
  PP                 3.1%     89     3.2%     37       3.0%     52
  uncodable          3.4%     98     3.1%     35       3.7%     63
TOTAL                100%   2861     100%   1141       100%   1720
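
The proportions of the four broad CIU types in Table 3.5 can be recomputed from the absolute numbers. The short Python sketch below does this; the variable and function names are hypothetical, and the count of conversational clausal CIUs is taken to be 521, which is the sum of its five subcategories (176 + 249 + 37 + 59 + 0) and the n reported in Figure 3.71.

```python
# Absolute numbers of CIUs per broad type: (conversational, monological).
counts = {
    "clausal": (521, 987),
    "nominal": (155, 301),
    "interactional": (235, 158),
    "others": (230, 274),
}


def shares(column):
    """Express each count as a percentage of the column total (one decimal)."""
    total = sum(column.values())
    return {label: round(100 * n / total, 1) for label, n in column.items()}


conversational = {label: pair[0] for label, pair in counts.items()}
monological = {label: pair[1] for label, pair in counts.items()}
overall = {label: sum(pair) for label, pair in counts.items()}

for name, column in [
    ("all", overall),
    ("conversational", conversational),
    ("monological", monological),
]:
    # Print the column total followed by the percentage of each CIU type.
    print(name, sum(column.values()), shares(column))
```

Each subcategory percentage in the table is computed in the same way, that is, relative to the total of its column rather than to the total of its parent category.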


It is important to consider that Totoli is an endangered, understudied language. 
While Riesberg (2014), Himmelmann & Riesberg (2013) and Riesberg et al. (2021) 
provide detailed discussions of the major aspects of the verbal morphology, other 
aspects of its grammar are still not fully worked out. In some cases, I had to 
make coding decisions that readers may or may not agree with. One example 
relevant to the current discussion is the coding of negation with a form of the 
negated existential predicate ko=/daan/kaddaan. This form of negation occurs 
frequently in both Totoli and the local (Manado) Malay variety. A simple clause
that is negated with an existential predicate will appear as a complex clause in the 
count, as it involves two predicates: the existential predicate and the predicate 
of the negated clause. Such decisions must be made for each language based on 
its unique grammar, and need to be considered when comparing reported data. 

The investigation presented above aimed to answer the question of what gram- 
matical structures a CIU typically contains. I will now turn to a discussion of 
grammatical structures found in embedded IUs of CIUs. 


3.3.2 Syntactic structures of embedded IUs of CIUs 


In this section, I discuss several aspects of the grammatical units found in
embedded IUs of CIUs. The analysis is straightforward and is not couched in the
framework and coding conventions used above in §3.3.1. Here, I use a limited
range of grammatical unit types which are sufficient to describe the majority of
structures found, such as noun phrases, verbs, prepositional phrases, adverbial
clauses, and relative clauses. The main aspects are illustrated with examples
below.

Noun phrases typically constitute their own (embedded) IUs, and for NPs in
preverbal position, this is observed with consistency. An example is provided in
(50), with the periogram shown in Figure 3.73. The simple one-word argument
NP sagaat ‘the half’ is parsed into a separate embedded IU, as indicated
by the pitch rise on the final syllable. It is followed by an IU containing the verb
and a following prepositional phrase. The “|” symbol indicates a boundary of an
embedded IU.


(50) sagaat | madabumai dei buta 
so-gaat mo-dabu=mo=ai dei buta 
ONE-part ST-fall=CPL=VEN LOC earth
‘half of it fell to the ground’ (pearstory_14_SP.007) > 
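
Because embedded IU boundaries are marked with “|” in the transcription line, the number of embedded IUs and their length in words can be read off mechanically. The following Python sketch illustrates this for example (50); the function name `embedded_ius` is hypothetical and the code is not part of the study's own tooling.

```python
def embedded_ius(ciu):
    """Split a CIU transcription at '|' and return a list of words per embedded IU."""
    return [segment.split() for segment in ciu.split("|") if segment.strip()]


ius = embedded_ius("sagaat | madabumai dei buta")
print(len(ius))                 # 2 embedded IUs
print([len(iu) for iu in ius])  # [1, 3] words per embedded IU
```

Counting syllables would require a syllabification of the Totoli forms and is therefore left out of the sketch.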


Headless relative clauses are consistently parsed as their own IUs when in 
preverbal position. For example, consider the CIU in example (51) and its corre- 
sponding pitch contour shown in Figure 3.74. The initial IU of the CIU comprises 
the headless relative clause anu ampi koloanan meaning ‘the one on the right side’. 
This is then followed by an IU containing both the verb saasalu ‘facing’ and the 
pronominal NP kita ‘us’. 
Figure 3.73: Periogram with pitch track (in st) for example (50), a CIU
consisting of two embedded IUs, speaker SP

Figure 3.74: Periogram with pitch track (in st) for example (51), a CIU
consisting of two embedded IUs, speaker SP

(51) anu ampi koloanan | saasaluai kita
anu ampi koloanan RDP-salu=ai kita
REL part right RDP-facing=VEN 2S
‘the one on the right-hand side is facing you’
(spacegames_sequence4_KSR-SP.231) >


In examples (50) and (51) above, the verb is grouped together in one IU with the
following constituent, i.e. the prepositional phrase in (50) and the pronominal
argument NP in (51). Such cases are common. However, in many instances, the
verb and possible following adverbs form an IU of their own, as seen in
example (52), where the verb bagulna ‘he was beating’ and the following
adverb poni ‘again’ constitute the first IU of the CIU and are grouped separately
from the following argument NP kalibomban ‘butterfly’.
Figure 3.75: Periogram with pitch track (in st) for example (52), a CIU
consisting of two embedded IUs, speaker RSM

(52) bagulna poni | kalibomban
bagul-0=na poni kalibomban
beat-UV=3S.GEN again butterfly
‘he was beating the butterfly again’ (story-monkey-butterfly_RSM.053) >


In fact, the same construction can be found with both realizations: with the 
verb and the following argument parsed together in one IU or separately in one 
IU each. Below are two instances of a nearly identical CIU which involves a verb 
and its oblique argument. In the first example (53) and its corresponding
visualization in Figure 3.76, the verb and the oblique argument form separate,
embedded IUs. This is clearly visible from the pitch rise on the last syllable of
the verb noliitaan ‘to meet’.
Figure 3.76: Periogram with pitch track (in st) for example (53), a CIU
consisting of two embedded IUs, speaker RDA

(53) inga noliitaan | takin tau dako
inga noli-ita-an takin tau dako
NEG RCP.RLS-see-RCP.RLS with person big
‘(I) didn't meet (my) parents’ (lifestory_RDA_1.124) >


Example (54) features an almost identical construction. However, in this case,
the verb and its oblique argument together constitute a single (embedded) IU.
There is no pitch rise on the last syllable of the verb noliitaan ‘to meet’, which
would typically indicate an IU boundary. The corresponding periogram with
pitch track (in st) is provided in Figure 3.77.


(54) danna | noliitaan takin tau dako
danna noli-ita-an takin tau dako
then RCP.RLS-see-RCP.RLS with person big
‘Then, (I) met (my) parents’ (lifestory_RDA_1.115) >


Figure 3.77: Periogram with pitch track (in st) for example (54), a CIU
consisting of two embedded IUs, speaker RDA

Other instances involve an NP with a modifying relative clause, which can either
occur together in one IU or be split into two IUs. The latter is more common
in postverbal position. For instance, in example (55) and its visualization
in Figure 3.78, the first word lau ‘currently’ appears in a separate, embedded IU
at the beginning of the CIU. It is followed by an IU containing the relative clause
anu lau suludan tau moane ana ‘which is being pushed by the man’, another IU
with the prepositional phrase dei dulak ogobbunna ‘in front of the well’, and the
final IU containing the verb moitaku ‘I see’. The IU containing the relative clause
spans six words or twelve syllables, and the IU with the prepositional phrase
has four words or seven syllables. Such lengthy IUs are not uncommon in
Totoli, as demonstrated by the examples in §3.2 (e.g. example (15) in Figure 3.29,
and example (1) in Figure 3.8). This indicates that the (embedded) IU in Totoli
differs significantly from the prosodic word and Accentual Phrase in Korean, as
discussed in §3.2.5.


(55) lau | anu lau suludan tau moane ana | dei dulak ogobbunna | moitaku
lau anu lau sulud-an tau ana dei dulak ogobbun=na
presently REL presently push-APPL person MED LOC front well=3S.GEN
mo-ita-0=ku
ST-see-UV=1S.GEN
‘what is currently pushed by the man in front of the well, as I see it’
(QUIS-focus_SP.026) >


Figure 3.78: Periogram with pitch track (in st) for example (55), a CIU
consisting of four embedded IUs, speaker SP

Example (56) and its visualization in Figure 3.79 illustrate a long and complex
CIU, where the various grammatical units are very regularly chunked into
(embedded) IUs. The CIU commences with an IU containing the connective adverb
bali ‘then/so’, which typically constitutes its own IU. The following IU contains
the subject NP tau ‘person’ along with its set of modifiers. This is succeeded by
an IU containing the verb nanaumai ‘to go down’. The next two IUs each contain
a version of the same prepositional phrase: the first involves the dummy/filler
element anuna, and the second the intended/repaired prepositional phrase
ulai puun alpukaat ‘from the avocado tree’. The CIU concludes with an IU
containing a relative clause that further modifies the noun in the prepositional
phrase.
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Figure 3.79: Periogram with pitch track (in st) for example (56), a CIU 


consisting of six embedded IUs, speaker SELP 


(56) bali | tau <na> nonipu togu alpukaat ia | nanaumai | ulai anuna | ulai <p> e
puun alpukaat ia | anu toaka itipuna
bali tau noN-tipu togu alpukaat ia
so person AV.RLS-pick possession avocado PRX
no-nau=mo=ai uli=ai anu=na uli=ai e puun
AV.RLS-go.down=CPL=VEN from=VEN FILL=3S.GEN from=VEN INTJ tree
alpukaat ia anu tooka ni-tipu-∅=na
avocado PRX REL finished RLS-pick-UV=3S.GEN
‘so, the person picking, the owner of the avocados, goes down from the
avocado tree that he was just picking.’ (pearstory_36_SELP.047) >


In §3.2.5, I discussed the case of Heads in a THL, which are immediately followed
by additional syntactic material. In this case, the Head of the THL construction
constitutes an IU in CIU-initial position. Adverbial and relative clauses that
fulfill this role can occasionally be quite long, consisting of five or more words.
Two examples are given below in (57)-(58). Example (57), visualized in
Figure 3.80, is a CIU with an extensive initial adverbial clause, phrased in two IUs.
The adverb laalau ‘currently’ is parsed as its own IU, while the remainder of the
adverbial clause is parsed as a single long IU containing six words or 13 syllables.


Figure 3.80: Periogram with pitch track (in st) for example (57), a CIU
consisting of three embedded IUs, speaker SELP


(57) lalau | isia dei babo andan lau monipu | notumalib
RDP-lau isia dei babo ondan lau mo-tipu
RDP-while 3S LOC above ladder while AV.RLS-pick
no-t<um>alib
AV.RLS-<AUTO.MOT>pass.by
‘While he was on the ladder picking (pears), (he) passed by’ >


Example (58), realized in Figure 3.81, is an instance of a CIU with two
initial IUs containing a complex adverbial clause.


Figure 3.81: Periogram with pitch track (in st) for example (58), a CIU
consisting of three embedded IUs, speaker SP


(58) kaasikan | magiigitai manana dolago itu | sapeda | nallumpak
keasikan mog-RDP-ita-i manana dolago itu sapeda
excitement AV.NRLS-RDP-watch-APPL child girl DIST bicycle
no-RDP-lumpak
ST-RDP-hit.against
‘because of his excitement in looking at the girl, his bicycle crashed
(against the stone)’ (pearstory_11_SP.025) >


The examples demonstrate regularities in chunking. However, more insightful 
are instances that appear to contradict these regularities. 

In many CIUs, the first word forms its own IU, which is also noted by Him- 
melmann (2018: 361). In fact, 31% of all CIUs in the corpus containing more than 
one word have their first word phrased as a separate embedded IU. This is often 
because words in the initial position of a CIU are connectives or one-word noun 
phrases that are regularly phrased as a single IU. See examples (50), (55), (56). 
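
As an illustration, a proportion of this kind can be computed from transcriptions that use the ‘|’ notation of the examples in this section. The following is a minimal sketch, not the procedure used for the corpus counts: the function names are hypothetical, and the three-item sample (the surface lines of examples (57), (58) and (61) as printed in this section) is far too small to reproduce the 31% figure.

```python
def split_ius(transcription: str) -> list[list[str]]:
    """Split a CIU transcription into embedded IUs (lists of words) at '|'."""
    return [iu.split() for iu in transcription.split("|") if iu.strip()]


def first_word_is_own_iu(transcription: str) -> bool:
    """True if the CIU has more than one word and its first IU is one word."""
    ius = split_ius(transcription)
    n_words = sum(len(iu) for iu in ius)
    return n_words > 1 and len(ius[0]) == 1


def share_first_word_separate(transcriptions: list[str]) -> float:
    """Share of multi-word CIUs whose first word forms a separate IU."""
    multi = [t for t in transcriptions
             if sum(len(iu) for iu in split_ius(t)) > 1]
    if not multi:
        return 0.0
    return sum(first_word_is_own_iu(t) for t in multi) / len(multi)


# Surface lines of examples (57), (58) and (61), copied as printed.
sample = [
    "lalau | isia dei babo andan lau monipu | notumalib",
    "kaasikan | magiigitai manana dolago itu | sapeda | nallumpak",
    "daan nadabu | nalimulas | alpukaat nsangotmoks gaake | buludna",
]

if __name__ == "__main__":
    print(f"{share_first_word_separate(sample):.2f}")
```

Single-word CIUs are excluded from the denominator, mirroring the restriction to CIUs ‘containing more than one word’ stated above.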

As explained above, adverbial clauses, relative clauses, and prepositional phrases 
are typically not further chunked, regardless of their length, as demonstrated by 
the long embedded IUs in examples (57) and (58). However, there are instances 
where the initial elements constitute a separate IU. Examples include the initial 
relative particle anu of a relative clause, the initial preposition of a prepositional 
phrase, and the initial conjunction of an adverbial clause. Consider (57) above, 
with its periogram in Figure 3.80. The initial word laalau ‘while’ of the adverbial
clause is phrased as its own IU. The following IU contains a pronoun as well as 
a prepositional phrase and a predicate. These are not further chunked, as is the 
case for adverbial clauses (see also example (55)). 

In many adverbial clauses, such as examples (57) and (58), the first element is 
phrased separately. When an initial element of a relative clause, adverbial clause, 
or prepositional phrase is phrased separately, it often co-occurs with a hesitation 
pause. Thus, boundary placement appears to serve as a planning device. Two 
examples are provided below. 

The CIU in example (59) contains a verb and a lengthy prepositional phrase.
The initial verb is phrased as a separate IU. In addition, in the following
prepositional phrase, the initial preposition dei is phrased separately from the
rest of the CIU and is followed by a short CIU-internal pause. The periogram
with pitch track (in st) is shown in Figure 3.82.


Figure 3.82: Periogram with pitch track (in st) for example (59), a CIU
consisting of three embedded IUs, speaker FAH


(59) niuntudnako | dei | tau nanak> manana nnako buno piir itu
ni-untud-∅=na=ko dei tau noN-tako manana noN-tako
RLS-bring-UV=3S.GEN=AND LOC person AV.RLS-steal child AV.RLS-steal
buno piir itu
fruit pear DIST
‘he brought (it) to the person who stole, the child who stole the pears’
(pearstory_11_SP.025) >


Another example is provided in (60), and its visualization is shown in
Figure 3.83. In this example, the initial preposition untuk ‘for’ is also phrased
separately, followed by a CIU-internal hesitation pause.


Figure 3.83: Periogram with pitch track (in st) for example (60), a CIU
consisting of two embedded IUs, speaker SP


(60) untuk | panarimaan tau mongouma ia
untuk poN-tarima-an tau mo-ngo-uma ia
for NMLZ-accept-NMLZ person ST-COLL-arrived PRX
‘for the reception of the visitors’ (pearstory_11_SP.025) >


Instances of an embedded IU containing elements of two separate clauses are
very rare. One such instance is presented in example (61), which is visualized
in Figure 3.84. This example comprises an adverbial clause and two coordinated
main clauses and involves an embedded IU containing grammatical units of both
main clauses. The CIU starts with an IU containing the adverbial clause daan
nadabu ‘after he fell’. The following verb nolimulas ‘scatters’ is uttered as its own
embedded IU. The subsequent argument alpukaat ‘the avocado’ is not parsed
into its own IU; instead, the pitch drops continuously despite the clause boundary.
An IU-final boundary tone is placed on the adverb gaake ‘also’. As a result, this
IU contains the argument NP of the first clause and the verb of the second clause.
The argument NP of the second clause buludna ‘his shinbone’ forms the final IU
of the CIU.


Figure 3.84: Periogram with pitch track (in st) for example (61), a CIU
consisting of four embedded IUs, speaker SELP


(61) daan nadabu | nalimulas | alpukaat nsangotmoks gaake | buludna
daan no-dabu no-l<um>elas alpukaat
later ST.RLS-fall AV.RLS-<AUTO.MOT>scatter avocado
no-ongot=mo=ko gaake bulud=na
ST.RLS-hurt=CPL=AND too shin=3S.GEN
‘after (he) fell, the avocados scatter and his shin hurts too’
(pearstory_36_SELP.025) >


In this section, I have illustrated some key aspects regarding the chunking 
of CIUs into IUs and their regularities. The syntactic content of IUs typically 
constitutes a complete grammatical unit, but there are rare instances where two 
independent grammatical units occur within the same IU. Additionally, certain 
units such as adverbial clauses, prepositional phrases, and relative clauses may 
have their initial element phrased as a separate IU. The realization of example
(61) provides an interesting case in point. Speakers usually mark the right-edge
boundary of a grammatical unit with a prosodic boundary, but boundary placement
is optional. In this example, the speaker did not place a boundary at the end of
the NP alpukaat ‘avocado’, which is the end of the first main clause.
Consequently, the IU consists of two grammatical units that belong to two
different clauses.


3.3.3 Comparing the syntax and prosody of embedded IUs of CIUs with singleton IUs


In this section, I compare the syntactic content of embedded IUs of CIUs with 
singleton IUs, investigating the differences or similarities between the two with 
regard to the grammatical structures they contain. The analysis of phrase-final 
tonal patterns in §3.2.1 and §3.2.2 revealed that prosodic patterns at the end of 
embedded IUs are essentially the same as those that occur in CIU-final position, 
providing evidence that these prosodic units are essentially of the same type. 

To further explore this assumption, I compare grammatical structures typically 
found in embedded IUs of complex CIUs with those found in singleton IUs. I 
illustrate this with two examples below. 

Compare example (62) and its visualization in Figure 3.85 with example (63) 
and its visualization in Figure 3.86. Both contain a similar structure: the connec- 
tive bali ‘so’, followed by the verb pogitata ‘look for’ and a subsequent headless 
relative clause in undergoer function. The difference is that in example (62), the 
connective is parsed into one CIU together with the main clause and constitutes 
the initial embedded IU of the CIU. In example (63), however, the connective
appears in a separate IU, clearly demarcated by further boundary phenomena such
as pitch reset and final syllable lengthening. Note, however, that the tonal targets
and the tonal contours of bali ‘so’ are identical in both instances.


Figure 3.85: Periogram with pitch track (in st) for example (62), a CIU
consisting of three embedded IUs, speaker SP


(62) bali | pagitata | anu babi
bali pog-ita-∅=ta anu babi
so SF-look.for-UV=2S.GEN REL pig
‘so, look for the pig.’ (spacegames_sequence4_KSR-SP.012)


Figure 3.86: Periogram with pitch track (in st) for example (63), two
CIUs, speaker SP


(63) a. bali
bali
so
‘so’

b. pagitata | anu batu
pog-ita-∅=ta anu batu
SF-look.for-UV=2S.GEN REL stone
‘look for the stone.’ (spacegames_sequence4_KSR-SP.012) >


In both examples above, the tonal targets of the connective bali are the same. 
In example (62), pitch is interpolated between the IU-final H% boundary tone 
located on the last syllable of the connective bali ‘so’ and the tonal targets of the 
second IU pogitata ‘look for’. In example (63), the connective forms its own
singleton IU and pitch is reset at the beginning of the then CIU-initial word
pogitata ‘look for’.

A second example is given in (64) and (65). Again, in (64), the left-dislocated
topic NP oto ‘car’ is parsed into one CIU with the following clause. In (65), the
same NP oto ‘car’ occurs as a separate singleton IU, clearly visible by the reset in
pitch and final syllable lengthening. Yet, tonal targets and the shape of the pitch
contour of the NP oto ‘car’ are the same in both instances.


Figure 3.87: Periogram with pitch track (in st) for example (64), a CIU
consisting of two embedded IUs, speaker SP


(64) ətə | ana lau suludanna ana
oto ana lau sulud-an=na ana
car MED presently push-APPL=3S.GEN MED
‘it is a car that is being pushed by him’ (QUIS-focus_SP.012) >


(65) a. ato
oto
car
‘(it is a) car’

b. anu laalau suludan tau moane dei dulak sgabbun ana
anu RDP-lau sulud-an tau moane dei dulak ogobbun ana
REL RDP-presently push-APPL person man LOC front well MED
‘that is currently pushed by the man in front of the well’
(QUIS-focus_SP.041-42) >


These two example pairs illustrate that the same grammatical structures may 
occur as either singleton IUs or an embedded IU in a complex CIU. In the exam- 
ples above, the tonal specifications remain the same. 


Figure 3.88: Periogram with pitch track (in st) for example (65), speaker
SP


Based on the corpus, I conducted a quantitative comparison of grammatical
structures found in embedded IUs of CIUs with those found in singleton IUs.
Figure 3.89 illustrates the comparison. Embedded IUs of complex CIUs (exemplified
here as ‘y1’, ‘y2’, ‘y3’) are given on the right-hand side, and singleton IUs,
which are not further chunked (exemplified here as ‘x’), are displayed on the
left-hand side.


CIU          CIU
IU    ↔    IU  IU  IU
x          y1  y2  y3


Figure 3.89: Comparison of syntactic content of embedded IUs and singleton IUs


Figure 3.90 compares the distribution of grammatical units found in singleton 
IUs with the distribution within embedded IUs of CIUs. The seven syntactic cat- 
egories account for 82% of the 3191 embedded IUs and 78% of all 1005 unchunked 
IUs in the corpus. 


Figure 3.90: Frequency distribution of the 7 major grammatical structures
of non-final IUs, within unchunked IUs: numbers are rounded to
one decimal place. [Bar chart with two panels: singleton IUs (78% of n = 1005)
and embedded IUs of CIUs (82% of n = 3191); y-axis: distribution of occurrences
in %. Legible bar labels include 31.8% (n = 320), 32.8% (n = 1046) and
9.9% (n = 316).]


The frequency distribution in Figure 3.90 shows that all seven major grammatical
units found in embedded IUs (‘y1’, ‘y2’, ‘y3’) also occur as unchunked
singleton IUs (‘x’), although with varying distribution. Specifically, 82% of all
embedded IUs correspond to one of the seven categories of grammatical units,
and these seven grammatical units also describe 78% of all singleton IUs. The
difference in distribution mainly pertains to verbs and connectives. Verbs occur
less frequently as singleton IUs than as embedded IUs; 25.6% of embedded IUs
are verbs, but only 22.5% of singleton IUs are verbs. The difference is larger for
connectives, which occur considerably less frequently as singleton IUs.
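
The bar labels that are legible in Figure 3.90 can be checked against the totals given in the text. The sketch below only recomputes each share from its count and a total (n = 1005 singleton IUs, n = 3191 embedded IUs). Which total each count belongs to is my inference from the arithmetic, not something the figure labels state, and the grammatical category of each bar is left out because it cannot be recovered reliably. All names are illustrative.

```python
TOTAL_SINGLETON = 1005  # singleton IUs in the corpus (from the text)
TOTAL_EMBEDDED = 3191   # embedded IUs of CIUs in the corpus (from the text)

# (count, assumed total, percentage as printed in Figure 3.90)
reported = [
    (320, TOTAL_SINGLETON, 31.8),
    (1046, TOTAL_EMBEDDED, 32.8),
    (316, TOTAL_EMBEDDED, 9.9),
]


def share(count: int, total: int) -> float:
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for count, total, printed in reported:
        status = "matches" if share(count, total) == printed else "differs"
        print(f"n = {count} of {total}: {share(count, total)}% ({status})")
```

Each recomputed share agrees with the printed percentage only under the assumed total, which supports the assignment of the counts to the two panels.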

Note that the absolute numbers of IUs and embedded IUs consisting of an 
NP are lower here than the number of nominal IUs in Table 3.72 above. This is 
because the counts here are more conservative than above. Nominal IUs above 
include cases where the nominal element co-occurs with a connective or a rela- 
tive clause, while the counts here consider only those singleton or embedded IUs 
that consist of a single NP only. 

In summary, the results show that the distribution of grammatical units typically
found in non-final IUs and singleton IUs is very similar. Elements that
regularly constitute a separate IU in a complex CIU are also found as unchunked
singleton IUs.


3.3.4 Discussion 


In §3.3.1 above, I explored the grammatical units typically found in a CIU. I
showed that 52.7% of CIUs in the corpus are clausal, 15.9% are nominal and
13.7% are interactional. Crucially, I found that proportions vary between
conversational and monological data. In §3.3.2, I briefly illustrated some of the major
aspects with regard to CIU-internal chunking and the major types of grammatical
units found in embedded IUs of CIUs. Finally, in §3.3.3, I compared grammatical
structures found in singleton IUs with those found in embedded IUs. I showed
that the same grammatical units found in embedded IUs of complex CIUs also
frequently occur as unchunked singleton IUs. Hence, neither the syntactic
content nor the tonal markings of singleton IUs and embedded IUs of CIUs differ.
If they were of a different category, I would expect the units to differ also with
regard to their syntactic content. This is clearly the case in the analysis of e.g.
French and Korean intonation (Jun & Fougeron 2000, Jun 2005a), for which both
the level of the IU and the lower-level Accentual Phrase are assumed, of which
the latter usually only contains one word. In light of these results, I conclude that
there is clear evidence that we have to assume recursive embedding of IUs into
complex CIUs in order to describe the data of Totoli, as proposed in the model
presented in §3.2.

To conclude this discussion, I will briefly address the distribution of IU-final
boundary-tone complexes and review some of the factors that may influence the
choice of either complex. Consider Figure 3.91, which shows how adverbial
clauses, noun phrases, and verbs are distributed across the boundary-tone
complexes. These three constituents are significant syntactic
components, and I will use them to compare the factors affecting the choice of
a boundary tone. The right-hand column in each pair indicates the distribution
of boundary-tone complexes in embedded IUs of complex CIUs, while the left-hand
column indicates their distribution in simple, singleton IUs.


Figure 3.91: Distribution of the three boundary-tone complexes in singleton
IUs and embedded IUs of CIUs, exemplified with the three grammatical
categories AdvCl, NP, and VP. [Stacked bar chart with panels ADVCL, NP and V;
x-axis: singleton IUs vs. embedded IUs; y-axis: 0% to 100%; legend:
boundary-tone complex.]


The figure illustrates that when adverbial clauses occur as singleton IUs, they
usually take the L-H% boundary-tone complex. This preference also applies to
adverbial clauses that are embedded in CIUs, although many take the LH-H%
pattern in this position. Noun phrases and verbs occurring
as singleton IUs show a strong preference for the LH-L% pattern, although other
boundary-tone complexes are possible. When occurring as embedded IUs of a
complex CIU, they tend towards the LH-H% pattern. The tendency for a rise
pattern in embedded IUs of complex CIUs is not surprising, as these IUs are
part of a larger unit, and the final rise pattern indicates non-finality within the
Compound IU.

The question then arises as to why there are two different rise patterns, i.e.
L-H% and LH-H%. One possible explanation is that the difference between the three
patterns LH-L%, L-H%, and LH-H% correlates with degrees of integration. The
LH-H% pattern represents “full integration” and is regularly used with verbs and 
noun phrases in non-final positions of complex IUs. The L-H% pattern and the 
LH-L% pattern would then be reserved for clause-external constituents. Adver- 
bial clauses may occur with the LH-H% full-integration pattern; however, about 
half of them occur with the L-H% pattern. 

Above, I showed an example of a left-dislocated topic, which involved the LH- 
L% pattern (cf. example (64)). The dislocated element usually involves the LH-L% 
pattern. Other instances involve right-dislocation, afterthoughts or appositive 
NPs. Example (66) shows such an instance. The NP siritana ia ‘this story’ is taken 
up again after the verb maalin ‘to get lost’, first by the pronoun ia ‘PRX’ and then 
by the appositive NP baran ia ‘this thing’. The verb bears the LH-L% boundary- 
tone complex and is immediately followed by the pronoun ia which takes up the 
NP siritana. 


Figure 3.92: Periogram with pitch track (in st) for example (66), speaker
SYNO


(66) siritana | ia geima daan | lau makadzang maaling | ia baran ia
sirita=na ia geimo daan lau moko-doon mo-alin ia
story=3S.GEN PRX not EXIST presently ST.AV-want ST-disappear PRX
baran ia
goods PRX
‘This story will never again get lost; this thing’
(explanation-lelegesan_SYNO.007) >


The question of what conditions the choice of either boundary-tone complex
will remain a topic for further research.


The present study aimed to achieve two primary goals: Firstly, to provide a
detailed discussion of the prosody and intonation of Totoli, adopting a
comprehensive approach that combined experimental evidence with data obtained from an
impressionistic analysis of a wide-ranging corpus of (semi-)spontaneous speech.
Secondly, to explore the grammatical structures that are typically found in
(Compound) Intonation Units, which include singleton IUs, embedded IUs of CIUs, and
complex CIUs as a whole.

Regarding the first objective, I have demonstrated that prominence is not a 
relevant concept in the prosody of Totoli, and that focus is not signaled by any 
prosodic cues. This was supported by evidence obtained from two experiments. 
The first experiment, which involved an RPT study, showed that naive native 
speakers generally could not agree on the location of prominences, suggesting 
that prominence may not be a significant category in Totoli’s prosody. The results 
of the second experiment further supported this hypothesis, as no evidence was 
found that prosodic means marked focus as an information-structural category. 

As Himmelmann & Kaufman (2020: 376) have noted, narrow focus on a sub- 
constituent of a clause or noun phrase in languages is typically not signaled by 
intonation alone but rather by syntactic means. However, narrow focus on a 
constituent such as the subject NP or the object NP has only been investigated 
to a limited extent in Austronesian languages (cf. for example Nagaya & Hwang 
2018, Kaufman 2005, Kaland et al. 2023). In this study, I have shown that in Totoli, 
syntactically equal SVO(O) constructions are not prosodically marked for focus 
when used as answers to questions that trigger focus on different constituents. 
This feature may be present in many other Austronesian languages, but addi- 
tional data from a variety of Austronesian languages are required to determine 
whether it is a common feature or limited to a specific subgroup of Austronesian 
languages. 

In §3.2, I examined the tonal patterns of the entire corpus of (semi-)spontaneous
speech, consisting of 2861 Intonation Units. Based on the analysis of the tonal 
specifications, I presented a model of the Compound Intonation Unit in Totoli. 
This model assumes either singleton IUs or complex Compound IUs (CIUs). In 
the former, the only pitch event is the IU-final boundary complex, which occurs 
on the final two syllables. In the latter, the CIU comprises a series of two or more
IUs, each of which bears one of the three IU-final boundary-tone complexes. 
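
The model summarized in this paragraph can be pictured as a small data structure. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, only one level of embedding is represented, and the boundary-tone complex is left unspecified (`None`) for the sample items because the text does not state it for each IU of examples (62) and (63a).

```python
from dataclasses import dataclass, field
from typing import Optional

# The three IU-final boundary-tone complexes named in the text.
BOUNDARY_TONE_COMPLEXES = ("L-H%", "LH-H%", "LH-L%")


@dataclass
class IU:
    """An Intonation Unit: a word sequence plus its final boundary-tone complex."""
    words: list[str]
    boundary: Optional[str] = None  # None = not annotated in this sketch

    def __post_init__(self) -> None:
        if self.boundary is not None and self.boundary not in BOUNDARY_TONE_COMPLEXES:
            raise ValueError(f"unknown boundary-tone complex: {self.boundary}")


@dataclass
class CIU:
    """A Compound Intonation Unit: one IU (singleton) or a series of IUs (complex)."""
    ius: list[IU] = field(default_factory=list)

    @property
    def is_singleton(self) -> bool:
        return len(self.ius) == 1

    @property
    def is_complex(self) -> bool:
        return len(self.ius) >= 2


# Example (62): one CIU with three embedded IUs (surface line as printed).
ex62 = CIU([IU(["bali"]), IU(["pagitata"]), IU(["anu", "babi"])])
# Example (63a): the same connective as a singleton IU.
ex63a = CIU([IU(["bali"])])

if __name__ == "__main__":
    print(ex62.is_complex, ex63a.is_singleton)
```

In this representation the connective bali is the same kind of object in both cases, which is the point of the comparison in §3.3.3: only its position inside or outside a larger CIU differs.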

My analysis showed that Totoli prosody is better described by assuming
recursive embedding of IUs into CIUs rather than parsing of IUs into prosodic 
units at a level below the IU, as Himmelmann’s (2018: 348) model of the IU in 
Austronesian languages of Indonesia and East Timor suggests. Himmelmann’s 
model suggests that Intonation Units are further parsed into smaller prosodic 
units, such as intermediate phrases, and tonal patterns delimiting them consis- 
tently occur at the boundaries of major syntactic units. However, I found that 
the tonal patterns in my data are essentially the same, although with an inverted 
distribution. I have demonstrated that an embedded IU of a CIU differs substan- 
tially from a prosodic word or what is labeled as Accentual Phrase in Korean or 
French. 

The absence of word prosody and the assignment of tone complexes to bound- 
aries of prosodic domains fit the characteristics of what Féry (2016: 270) labels 
Phrase Languages: 


Phrase languages resemble intonation languages in that their tonal speci- 
fications are mostly assigned at the level of p-phrases and t-phrases. But 
contrary to intonation languages, specifications at the level of the word 
are sparse, absent or only weakly implemented. Phrase languages do not 
automatically associate pitch accents with stressed syllables, most tones 
are nonlexical (or ‘postlexical’). (Féry 2016: 270)


In fact, many Austronesian languages may fall under the category Phrase Lan- 
guages, following Himmelmann’s (2018: 347) assertion: 


[...] it seems likely that prosodic prominence does not have a major role to 
play in marking information-structural categories. If at all, prosodic phras- 
ing may be of relevance in this regard inasmuch as it is not determined by 
syntactic or processing constraints. 


Further evidence for recursive embedding of IUs into CIUs comes from an anal- 
ysis of the grammatical structures that IUs typically contain. I found that a small 
set of categories suffices to describe the majority of their content. I compared 
the grammatical units typically occurring in embedded IUs with those that occur
in singleton IUs which are not further segmented and I found that they are
essentially similar, although again with varying proportions. 

In sum, tonal patterns at the edges of singleton IUs and final IUs of CIUs are 
similar to those occurring at the right edge of non-final embedded IUs of CIUs. 
The syntactic structures they contain also occur as simple, singleton IUs which
are not further chunked. In light of these results, I concluded that singleton IUs
and embedded IUs of CIUs are essentially of the same nature with the major
difference being the presence or absence (i.e. the strength) of further typical
boundary phenomena such as pitch reset, final lengthening, pauses and glottal- 
ization. A systematic analysis of boundary strength remains an object for future 
studies. Furthermore, although the tonal events at the edges of IUs are the same, 
it might be the case that they vary with regard to tonal scaling. That is, tonal 
events at the right-edge boundary of a CIU may be essentially the same as at the 
right-edge boundary of embedded IUs within CIUs but may vary in their tonal 
scaling (Riad 2018). This is an aspect which I have not systematically investigated 
here, and which presents a promising avenue for further research. 

This research also opens many other questions. First, what determines the
choice of the final boundary tone of those IUs which are part of a CIU? Speakers
are consistent in their choice of an IU-final boundary tone, and the grammatical
unit contained in an IU appears to trigger the choice of boundary-tone complex.
I suggested that the different patterns might be explicable by different degrees of
integration, though the explanation for these different patterns requires further
research. Second, what triggers the realization of two grammatical units as
either two separate IUs or a single complex one? Verbs followed by an NP often
occur as a single IU. This is also observed with verbs followed by a PP, yet the
tendency appears to be less strong. The analysis of the intonation of Totoli in
Chapter 3 focused on the tonal patterns exclusively. Further investigations are
needed in order to correlate the different boundary-tone complexes with other
acoustic phenomena such as duration, intensity, and possibly spectral tilt and
voice quality. A particularly fruitful approach may be the description of
the intonation of Totoli with the attractor-based model, which encompasses both
categorical and continuous components and also accounts for the variation of
their frequency (Roessig et al. 2019, Roessig 2021).

Little is known about the prosody and intonation of Austronesian languages. 
The study presented here pertains to one language in the region, and many of the 
results may well apply to other languages in the area. This study represents one 
of the most comprehensive investigations into the prosody and intonation of any 
Austronesian language to date. Further research on other languages is necessary 
to relate the reported results and insights to other languages in the region, and 
to determine which features are specific to Totoli and which are common to the 
region or the language family as a whole. 


Appendix A: Appendix


A.1 RPT Experiment 


A.1.1 Recorded speaker information 
Table A.1 shows the speaker information of the RPT experiment discussed in §2.1. 


Table A.1: Information of recorded speakers for the stimulus of the RPT
experiment discussed in §2.1


ID     ORIGIN      GENDER  YEAR OF BIRTH  N SPEECH SAMPLES
FAH    Nalu        m       1964           14
IRN    Pinjan      m       1967           4
RSTM   Dadakitan   m       1966           17
SNG    Nalu        m       1940           15
SP     Nalu        m       1958           21

A.1.2 Participant information 


Table A.2 shows the participant information of the RPT experiment discussed in §2.1.
The third column in the table refers to the participants’ place of residence at the
time of data collection, which almost always corresponds to the location where
they grew up.


Table A.2: Speaker information of participants of RPT experiment discussed
in §2.1


SPEAKER  YEAR OF BIRTH  PLACE OF RESIDENCE  GENDER
AKR      1990           Nalu                m
BSTN     1976           Nalu                m
DHL      1988           Nalu                f
DT       1989           Nalu                m
DDN      1988           Nalu                m
FSL      1994           Binontoan           m
FTR1     1994           Binontoan           f
FTR2     1991           Binontoan           f
IFS      1986           Binontoan           m
IM       1972           Kalangkangan        f
MG       2000           Pinjan              m
NRBT     1983           Nalu                f
OCH      1994           Binontoan           f
RMDN     1994           Nalu                m
RID      1998           Binontoan           f
RST      1983           Nalu                m
RDT      1981           Nalu                m
SRN      1985           Nalu                f
STDI     1988           Kalangkangan        m
WN       1979           Kalangkangan        m


A.1.3 Instructions of boundary marking task


A.1.3.1 Indonesian original


Ketika seseorang berbicara, dia akan membagi ucapan mereka menjadi
potongan-potongan. Potongan-potongan tersebut membentuk kelompok kata-kata yang
memudahkan pendengar untuk memahami ucapan pembicara. Potongan-potongan
tersebut penting terutama saat pembicara memproduksi ucapan yang panjang.


Contoh potongan yang mungkin Anda ketahui adalah potongan nomor ketika 
Anda memberi tahu nomor telepon Anda kepada orang lain. Biasanya, Anda 
tidak setiap kali memberi satu nomor (0, 8, 1, 3 ...), tetapi Anda akan memotong 
nomor hp tersebut menjadi kelompok-kelompok yang terdiri atas dua, tiga, atau 
empat angka (081, 358, 772 ...). 


Untuk rekaman yang akan Anda dengar, Anda diminta untuk menandai bagian
yang terdengar sebagai satu potongan. Dengan mengklik kata terakhir dari ucapan
tersebut, Anda dapat menetapkan batas, yang kemudian muncul di belakang
kata yang diklik. Batas antara dua potongan tidak harus sama dengan lokasi tempat
Anda akan menulis tanda koma, titik, atau tanda baca lainnya. Jadi, Anda
harus benar-benar hati-hati mendengar ujaran dan tandai batas yang Anda dengar
sebagai akhir sebuah potongan.


Jawaban yang Anda berikan tidak ada yang salah atau benar karena semuanya 
bergantung pada rasa bahasa. 


Jika Anda mau memperbaiki pilihan Anda, Anda dapat mengklik kata tersebut
untuk kedua kalinya, dan batas yang menjadi pilihan awal Anda akan lenyap.


Sebuah potongan mungkin saja berupa satu kata, atau mungkin terdiri atas 
beberapa kata, dan ukuran (jumlah kata) dalam setiap potongan dari para pem- 
bicara bisa saja berbeda-beda dalam satu ujaran. Beberapa ujaran mungkin Anda 
dengar konsisten, yaitu terdiri atas satu potongan saja. Jika demikian, Anda tidak 
perlu menandai batas potongan. 


Anda dapat memutar setiap rekaman kalimat sebanyak dua kali. Akan tetapi, 
tidak memungkinkan untuk menghentikan rekaman pada saat contoh kalimat 
sedang diputar. 

Contoh: 


081|358|772...
0813|5877|2...
Bapak saya | sudah datang 
Bapak | saya sudah datang 


Selamat mengikuti eksperimen ini! 

Silahkan menandai bagian yang Anda dengar sebagai satu potongan. Dengan 
mengklik kata terakhir salah satu potongan, batas akan muncul di belakang kata 
yang diklik. 


A.1.3.2 English translation 


When someone speaks, they divide their speech into segments. These segments 
form groups of words that make it easier for the listener to understand the speaker’s 
message. These segments are especially important when the speaker produces 
long utterances. 


An example of segments that you may be familiar with is the number segments
when you give your phone number to someone else. Usually, you do not give the
number one digit at a time (0, 8, 1, 3...), but you break up the phone number into
groups consisting of two, three, or four digits (081, 358, 772...).


For the recordings that you will hear, you are asked to mark the sections that 
sound like one segment. By clicking on the last word of the speech segment, you 
can set a boundary, which will then appear behind the clicked word. The bound- 
ary between two segments does not necessarily have to be in the same location 
as where you would write a comma, period, or other punctuation marks. There- 
fore, you must listen carefully to the speech and mark the boundary that you 
hear as the end of a segment. 


There are no right or wrong answers, as everything depends on one’s
sense of language.


If you wish to revise your selection, you can click on the word again, and the 
boundary that was your initial choice will disappear. 


A segment may consist of a single word or may be made up of several words, 
and the size (number of words) of each segment from the speakers may vary 
within one utterance. Some utterances may sound consistent, consisting of only 
one segment. If so, you do not need to mark any segment boundaries. 


You can play each recorded sentence twice. However, it is not possible to stop 
the recording while the sample sentence is playing. 

Example:


081|358|772...
0813|5877|2...
Bapak saya | sudah datang (approx. “My father already came home”)
Bapak | saya sudah datang (approx. “Father, I already came home”)


Enjoy the experiment! 

Please mark the parts that you hear as one chunk. When you click on the last word
of a chunk, a boundary will appear behind the clicked word.


A.1.4 Instructions of prominence marking task


A.1.4.1 Indonesian original


Dalam berbicara seseorang akan mengucapkan beberapa atau banyak kata dalam 
sebuah kalimat dengan nada yang lebih menonjol dibandingkan dengan kata- 
kata lain yang terdapat dalam kalimat tersebut. Kata-kata dengan nada yang 
menonjol ini biasanya dapat dirasakan oleh pendengarnya. Tugas Anda adalah 
menandai (mewarnai) kata-kata yang nadanya Anda dengar lebih menonjol diband- 
ingkan dengan kata-kata lain dalam rekaman kalimat yang akan Anda putar. 


Berikut ini Anda akan diputarkan 71 kalimat. Setiap kalimat juga akan dis- 
ajikan dalam bentuk tertulis. 


Tugas Anda adalah mewarnai semua kata yang nadanya Anda anggap lebih
menonjol (mis. lebih tinggi) dibandingkan dengan kata-kata lain pada setiap rekaman
kalimat yang Anda dengarkan. Untuk mewarnai kata, silakan mengklik kata
tersebut dan warnanya akan berubah menjadi merah:


Dia melihat sapi 


Dalam hal ini, Anda dimungkinkan untuk memilih lebih dari satu kata pada se- 
tiap rekaman kalimat! 


Dia melihat sapi dan kuda makan rumput


Jika Anda mau memperbaiki pilihan Anda, Anda dapat mengklik kata tersebut
untuk kedua kalinya, dan kata yang menjadi pilihan awal Anda akan kembali
berubah warna menjadi hitam.


Anda dapat memutar setiap rekaman kalimat sebanyak dua kali. Akan tetapi, 
tidak memungkinkan untuk menghentikan rekaman pada saat contoh kalimat 
sedang diputar. 


Selamat mengikuti eksperimen ini! 


Tandai kata-kata yang terdengar lebih menonjol untuk Anda. 


A.1.4.2 English translation 


When speaking, individuals often emphasize certain words in a sentence through 
variations in tone. These prominent words are typically noticeable to the listener. 
Your objective is to identify and highlight (using color) the words in a recorded 
sentence where the speaker’s tone stands out in comparison to the other words. 


You will hear 71 sentences. You will also be provided with each sentence as a 
written transcript. 


Your task is to color all the words that you deem to stand out more (e.g. through a
higher tone) compared to the other words in each recorded sentence that you
listen to. To color a word, please click on the word and it will turn red:


S/he sees a cow 


In this case, it is possible for you to choose more than one word in each
recorded sentence!


S/he sees a cow and a horse eating grass 


If you need to revise your selection, click on the word again, and it will revert
to its original color, black.


You can play each recording twice. It will not be possible to stop the recording 
while it is playing. 


Enjoy the experiment! 


Mark the words that sound more prominent to you. 


A.2 Focus marking 


A.2.1 Recorded speaker information 


Table A.3 shows the information on the speakers recorded for the stimuli of the
focus marking experiment discussed in §2.2.


Table A.3: Information of recorded speakers for the stimulus of the 
Focus Marking experiment discussed in §2.2 


SPEAKER ORIGIN GENDER YEAR OF BIRTH 


AKR Nalu m 1990 
DT Nalu m 1989 
FTR Binontoan f 1994 
IFS Binontoan m 1986 
SP Nalu m 1958 
ZHRM Tambun m 1965 


A.2.2 Participant information


Table A.4 shows the participant information of the focus marking experiment
discussed in §2.2.3. The third column in the table refers to the participants’ place
of residence at the time of data collection, which almost always corresponds to
the location where they grew up.


Table A.4: Participant information of focus marking perception
experiment discussed in §2.2.3


SPEAKER  YEAR OF BIRTH  PLACE OF RESIDENCE  GENDER
AM       1978           Binontoan           m
ANDR     1997           Dapalak             f
AAL      1988           Nalu                m
BLW      1994           Binontoan           m
DAT      1989           Nalu                m
DWS      1994           Binontoan           f
EKW      1989           Binontoan           f
HLM      1986           Binontoan           f
IFS      1986           Binontoan           m
IRM      1972           Laulalang           f
IWRM     1978           Binontoan           m
ISRW     1999           Gio                 f
JMTR     1993           Binontoan           f
SRMN     1986           Nalu                f
NSK      1980           Binontoan           f
NRM      1981           Binontoan           f
MRB      1994           Nalu                m
SRM      1985           Nalu                f
WN       1979           Kalangkangan        m
YK       1985           Binontoan           m


A.2.3 Stimuli


Examples (1)-(9) are the question-answer (QA) pairs that were used in the
experiment discussed in §2.2.


(1) a. inaŋ nɔninum sɔpa?
       inaŋ   noN-inum     sɔpa
       mother AV.RLS-drink what
       ‘What does the mother drink?’
    b. inaŋ nɔninum ɔgɔ!
       inaŋ   noN-inum     ɔgɔ
       mother AV.RLS-drink water
       ‘The mother drinks water.’


(2) a. isei nɔninum ɔgɔ?
       isei noN-inum     ɔgɔ
       who  AV.RLS-drink water
       ‘Who drinks the water?’
    b. inaŋ nɔninum ɔgɔ!
       inaŋ   noN-inum     ɔgɔ
       mother AV.RLS-drink water
       ‘The mother drinks water.’


(3) a. sɔpa niinum inaŋ?
       sɔpa ni-inum-∅    inaŋ
       what RLS-drink-UV mother
       ‘What does the mother drink?’
    b. ɔgɔ niinum inaŋ!
       ɔgɔ   ni-inum-∅    inaŋ
       water RLS-drink-UV mother
       ‘The mother drinks water.’


(4) a. ɔgɔ niinum isei?
       ɔgɔ   ni-inum-∅    isei
       water RLS-drink-UV who
       ‘Who drinks the water?’
    b. ɔgɔ niinum inaŋ!
       ɔgɔ   ni-inum-∅    inaŋ
       water RLS-drink-UV mother
       ‘The mother drinks water.’


(5) a. sɔpa nɔlugud seseŋ?
       sɔpa noN-lugud    seseŋ
       what AV.RLS-chase cat
       ‘Who/What chases the cat?’
    b. deuk nɔlugud seseŋ!
       deuk noN-lugud    seseŋ
       dog  AV.RLS-chase cat
       ‘The dog chases the cat.’


(6) a. deuk nɔlugud sɔpa?
       deuk noN-lugud    sɔpa
       dog  AV.RLS-chase what
       ‘What does the dog chase?’
    b. deuk nɔlugud seseŋ!
       deuk noN-lugud    seseŋ
       dog  AV.RLS-chase cat
       ‘The dog chases the cat.’


(7) a. maŋana dɔlagɔ nemeenan buuk dei isei?
       maŋana dɔlagɔ noN-been-an      buuk dei isei
       child  girl   AV.RLS-give-APPL book LOC who
       ‘Who does the girl give the book to?’
    b. maŋana dɔlagɔ nemeenan buuk dei inaŋna!
       maŋana dɔlagɔ noN-been-an      buuk dei inaŋ=na
       child  girl   AV.RLS-give-APPL book LOC mother=3.SG
       ‘The girl gives the book to her mother.’


(8) a. maŋana dɔlagɔ nemeenan sɔpa dei inaŋna?
       maŋana dɔlagɔ noN-been-an      sɔpa dei inaŋ=na
       child  girl   AV.RLS-give-APPL what LOC mother=3.SG
       ‘What does the girl give to her mother?’
    b. maŋana dɔlagɔ nemeenan buuk dei inaŋna!
       maŋana dɔlagɔ noN-been-an      buuk dei inaŋ=na
       child  girl   AV.RLS-give-APPL book LOC mother=3.SG
       ‘The girl gives the book to her mother.’


(9) a. isei nemeenan buuk dei inaŋna?
       isei noN-been-an      buuk dei inaŋ=na
       who  AV.RLS-give-APPL book LOC mother=3.SG
       ‘Who gives the book to the mother?’
    b. maŋana dɔlagɔ nemeenan buuk dei inaŋna!
       maŋana dɔlagɔ noN-been-an      buuk dei inaŋ=na
       child  girl   AV.RLS-give-APPL book LOC mother=3.SG
       ‘The girl gives the book to her mother.’

A.2.4 Instructions


A.2.4.1 Indonesian original


Tolong mendengar dengan seksama.

Anda akan mendengar dua pasangan pertanyaan-jawaban.

Hanya salah satunya adalah yang benar!

Tugas Anda adalah memilih pasangan yang cocok.

Anda akan mendengar setiap pasangan pertanyaan-jawaban dua kali.

Setelah kedua kalinya, Anda harus pilih salah satu yang kedengarannya lebih
cocok.

Pasangan yang mana lebih cocok?


A.2.4.2 English translation


Please listen carefully.

You will hear two question-answer pairs.

Only one of them is correct!

Your task is to choose a compatible pair.

You will hear each question-answer pair twice.

After the second time, you should choose the one that sounds more compatible.

Which pair is more compatible?




A.3 The corpus 
Table A.5: Overview of conversational recordings


FILENAME         SPEAKER   GENDER  N IUs  DURATION
QUIS-animalgame  RSM, AKR  m, m    112    00:03:08
spacegames       KSR, SP   m, m    1215   00:37:00
Total                              1327   00:40:08
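
The totals of Table A.5 can be cross-checked with a few lines of code. The following is a minimal Python sketch and not part of the original study: the helper names `parse_hms` and `format_hms` and the `recordings` list are introduced here purely for illustration, and the values are copied from the two recordings listed in the table.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Convert an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta: timedelta) -> str:
    """Format a timedelta as HH:MM:SS."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# (filename, number of IUs, duration), copied from Table A.5
recordings = [
    ("QUIS-animalgame", 112, "00:03:08"),
    ("spacegames", 1215, "00:37:00"),
]

# Sum the IU counts and the durations separately
total_ius = sum(n_ius for _, n_ius, _ in recordings)
total_duration = sum(
    (parse_hms(duration) for _, _, duration in recordings), timedelta()
)

print(total_ius, format_hms(total_duration))  # 1327 00:40:08
```

The same two helpers could be reused for any other table that lists IU counts and durations in this format.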


Table A.6: Overview of monological recordings


FILENAME                       SPEAKER  GENDER  N IUs  DURATION
explanation_lelegesan          SYNO     m       164    00:06:51
explanation_wedding-tradition  ZBR      m       321    00:16:13
pearstory                      SP       m       46     00:02:03
pearstory                      RSTM     m       192    00:06:23
pearstory                      RD       f       72     00:03:11
pearstory                      SP       m       51     00:02:33
pearstory                      IRN      m       31     00:01:27
pearstory                      MLI      f       44     00:04:22
pearstory                      SNG      m       131    00:05:57
pearstory                      SELP     f       70     00:02:38
pearstory                      FAH      m       74     00:03:09
QUIS-animalgame                SP       m       89     00:11:09
QUIS-focus                     SP       m       41     00:04:37
lifestory                      RDA      m       198    00:07:35
story-monkey-butterfly         RSM      m       69     00:02:25
story-monkey-crocodile         RSM      m       47     00:01:35
story-monkey-python            RSM      m       27     00:02:33
story-monkey-turtle            RSM      m       72     00:02:14
story-session                  MMN      f       88     00:03:13
Total                                           1899   01:30:08
This book is an investigation into aspects of prosody, intonation and the prosody-syntax
interface in Totoli, an endangered Austronesian language. With a strongly data-driven 
approach, the study integrates a combination of experimental evidence from both pro- 
duction and perception with corpus-based evidence through descriptive and inferential 
statistics. 

The study takes the prime structuring unit of speech - the Intonation Unit — as its 
principal unit of investigation. It presents a thorough description of the IU, develops 
an intonational model of it, and investigates the syntactic units it contains. The author 
argues that the data is best analysed by assuming recursive embedding of Intonation 
Units into Compound Intonation Units. 

This research represents a significant advancement in our understanding of the na- 
ture of prosodic systems found in the languages of the region and in intonational sys- 
tems in general. It is one of the few investigations into the intonation of Austronesian 
languages and its analytical proposals are relevant both to prosodic theory and to phono- 
logical typology. 


